1 Introduction

Runtime Verification (RV) [35] is a lightweight verification technique that checks whether the System Under Scrutiny (SUS) satisfies a correctness property by analysing its current execution. It has its origins in model checking, as a more scalable (yet still formal) approach to program verification where state explosion problems (which are inherent to model checking) are mitigated [34, 35]. RV is often used to complement other verification techniques such as theorem proving, model checking and testing, in a multi-pronged approach towards ensuring system correctness [3,4,5, 21]. The technique has fostered a number of verification tools such as [6, 10, 12, 16, 18,19,20,21, 27, 30, 37,38,39], to name but a few. It has also been used for a variety of applications, ranging from checking the executions of the NASA Mars Rover software [12] and other autonomous research vehicles [28], to more mundane tasks such as checking for rule violations in financial systems [8], video games [40] and electronic examinations [29].

At its core, RV generally assumes a logic describing the correctness specifications that are expected to be satisfied by the SUS. From these specifications, programs called monitors are generated and instrumented to run with the SUS, so as to analyse its current execution (expressed as a trace of events) and infer any system violations or satisfactions w.r.t. these specifications (see Fig. 1). Violation and satisfaction verdicts are typically considered to be definite, i.e., cannot be retracted upon observing further system events. Monitors are themselves computational entities that incur runtime costs, and considerable effort in RV is devoted to the engineering necessary to keep runtime overheads at feasible levels. Yet, a RV set-up such as the one depicted in Fig. 1 raises additional questions that warrant equal consideration.

Fig. 1. Runtime monitor synthesis and operational set-up.

We Need to Talk About Monitors. The expressiveness of the verification technique w.r.t. the correctness properties that can be described by the logic—an attribute often termed monitorability—should be a central question in RV. In fact, specification logics that are not necessarily wedded to the RV technique may express properties that cannot be monitored for at runtime. In particular, certain aspects of a chosen specification logic can potentially reduce the analytical capabilities of the monitors, such as the ability to express properties describing multiple or infinite executions of a system. In such cases, the (finite) trace exhibited by the running system (see Fig. 1) may not contain sufficient execution information to enable the monitor to determine whether a property under analysis is violated or satisfied by the SUS. Therefore, being able to identify which properties are monitorable is essential for devising an effective strategy in a multi-pronged verification approach that tailors the verification method used to each correctness specification.

Another fundamental question raised by RV concerns monitor correctness. Generally, monitors are considered to be part of the trusted computing base, and are thus expected to exhibit correct behaviour. When this is not the case, erroneous monitors could invalidate any runtime analysis performed, irrespective of the low overheads a monitor may brandish. On top of that, what is actually expected of these monitors is seldom defined in precise terms, making it hard to ascertain monitor correctness. For instance, in a typical RV set-up such as that of Fig. 1, one would expect detection soundness, whereby any rejections (resp. acceptances) flagged by the monitor imply that the system indeed violates (resp. satisfies) the property being monitored. Other settings may also stipulate detection completeness, by which all property violations (resp. satisfactions) are flagged accordingly by the monitor observing the SUS. One may also require attributes that are independent of the specification logic chosen, such as passivity (i.e., the absence of monitor interference with the execution of the SUS), and determinism (i.e., whether the monitor consistently yields the same verification outcome for the same observations).

This paper sheds light on one possible approach for addressing these questions. In particular, it showcases the work embarked upon in Theoretical Foundations for Monitorability (TheoFoMon), a three-year project funded by the Icelandic Research Fund with the aim of investigating the expressiveness and correctness issues outlined above. To maintain a general approach, the aforementioned questions are studied in the context of the Hennessy-Milner Logic with recursion (\(\mu \mathrm {\textsc {HML}}\)) [33], a reformulation of the expressive and well-studied modal \(\mu \)-calculus [32]—the logic is agnostic of the verification technique used, and can embed widely used temporal logics such as \(\text {CTL}^*\) [9, 17]. Labelled Transition Systems (LTSs) [2], formalisms that have been used to describe a vast range of computational phenomena, are employed to abstractly model the behaviour of systems and monitors in terms of graphs. This generality serves two goals. On the one hand, the LTS operational formalism abstracts away instance details of specific RV settings, facilitating the distillation of the core concepts at play. On the other hand, the considerable expressiveness of the logic immediately extends any findings and observations to other logics that can be embedded within it.

Paper Overview. The preliminaries in Sect. 2 present the basics for modelling systems and provide an introduction to our touchstone logic. Section 3 presents our monitor and instrumentation models. Section 4 links the properties specified by the logic and the verdicts reached by monitors, whereas Sect. 5 discusses notions of monitor correctness. Section 6 surveys extensions on the monitorability and correctness results presented, and Sect. 7 concludes.

2 Preliminaries: Groundhog Day

We provide a brief outline of the main technical machinery used by our exposition, namely, the system model and the specification logic. Interested readers are encouraged to consult [2, 33] for more details.

The Model. LTSs are directed graphs with labelled edges modelling the possible behaviours that can be exhibited by an executing system. Formally, a LTS is a triple \(\langle \textsc {Sys},(\textsc {Act} \cup \{\tau \}),\!\overset{}{\longrightarrow }\!\rangle \) comprising:

  • a set of system states \(p,q,r \in \textsc {Sys} \) (every system state may also be used to denote a system that starts executing from that state),

  • a set of visible actions \(\alpha \in \textsc {Act} \) and a distinguished silent action \(\tau \), where \(\tau \not \in \textsc {Act} \) and \(\mu \) ranges over \((\textsc {Act} \cup \{\tau \})\), and finally,

  • a ternary transition relation between states labelled by actions; we write \(p \!\overset{\mu }{\longrightarrow }\!q \) in lieu of \(\langle p,\mu ,q \rangle \in \ \!\overset{}{\longrightarrow }\!\).

The notation \(p \!\overset{\mu }{\not \longrightarrow }\! \) is written when \({p \!\overset{\mu }{\longrightarrow }\!q}\) holds for no system state \(q\). Additionally, \(p \!\overset{}{\Longrightarrow }\!q \) denotes a sequence of silent actions \(p (\overset{\tau }{\longrightarrow })^{*}q \) from state \(p\) to \(q\), whereas \(p \!\overset{\alpha }{\Longrightarrow }\!q \) is written in place of \(p \,\Longrightarrow \cdot \!\overset{\alpha }{\longrightarrow }\!\cdot \Longrightarrow \,q \), to represent a visible action \(\alpha \) that may be padded by preceding and succeeding silent actions. We occasionally use the notation \({{\mathbf {\mathsf{{0}}}}}\) to refer to a system (state) that is deadlocked and exhibits no actions (not even silent ones). We let traces, \(t,u\in \textsc {Act} ^*\), range over sequences of visible actions and write sequences of transitions as \(p \!\overset{t}{\Longrightarrow }\!q \), where \({t= \alpha _1,\ldots ,\alpha _n}\) abbreviates \(p \!\overset{\alpha _1}{\Longrightarrow }\! \cdots \!\overset{\alpha _n}{\Longrightarrow }\!q \). As usual, \(\epsilon \) denotes the empty trace.
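As a concrete aid, the weak transition relations above can be computed mechanically for a finite LTS. The following sketch is ours, not from the paper (the names `TAU`, `step`, `tau_closure` and `weak_step` are illustrative); it assumes the LTS is encoded as a set of `(state, action, state)` triples:

```python
# A minimal sketch of a finite LTS and its weak transition relations,
# assuming edges are encoded as (state, action, state) triples.
TAU = "tau"  # the distinguished silent action

def step(lts, p, mu):
    """All q such that p --mu--> q (the strong transition relation)."""
    return {q for (s, a, q) in lts if s == p and a == mu}

def tau_closure(lts, p):
    """All q such that p ==> q, i.e. reachable via zero or more tau steps."""
    seen, frontier = {p}, [p]
    while frontier:
        s = frontier.pop()
        for q in step(lts, s, TAU):
            if q not in seen:
                seen.add(q)
                frontier.append(q)
    return seen

def weak_step(lts, p, alpha):
    """All q such that p ==alpha==> q: alpha padded by tau steps on both sides."""
    result = set()
    for s in tau_closure(lts, p):
        for mid in step(lts, s, alpha):
            result |= tau_closure(lts, mid)
    return result
```

Iterating `weak_step` over a trace \(t = \alpha_1,\ldots,\alpha_n\) then yields the weak trace relation \(p \!\overset{t}{\Longrightarrow}\! q\).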

Fig. 2. The LTS depicting the behaviour of three server implementations \(p\), \(q\) and \(r\).

Example 1

The directed graph in Fig. 2 depicts a LTS describing the implementation of three servers whose events are represented by the set of visible actions \(\textsc {Act} =\{\texttt {req}, \texttt {ack}, \texttt {lim}\}\). State \(p\) shows the simplest possible implementation of a server that receives (req) and acknowledges (ack) client requests repeatedly. State \(q\) denotes an extension of this implementation that may reach a termination limit (transition lim) after a number of serviced requests (i.e., req followed by ack). Finally, state \(r\) represents an unpredictable server implementation that occasionally acknowledges a preceding client request twice. \(\blacksquare \)
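One plausible encoding of these three servers as edge sets, together with a check of which finite traces each can exhibit, is sketched below. The intermediate state names (`p1`, `q1`, `r1`, ...) and the exact shape of \(r\)'s non-deterministic branching are our own reading of the description above, not taken from Fig. 2 itself:

```python
# An assumed encoding of the three servers of Fig. 2 as (state, action, state)
# edges; intermediate state names are our own.
P = {("p", "req", "p1"), ("p1", "ack", "p")}
Q = {("q", "req", "q1"), ("q1", "ack", "q"), ("q", "lim", "q2")}
R = {("r", "req", "r1"), ("r1", "ack", "r"),                       # single ack
     ("r", "req", "r2"), ("r2", "ack", "r3"), ("r3", "ack", "r")}  # double ack

def can_trace(lts, state, trace):
    """Can `state` exhibit the given sequence of visible actions?"""
    if not trace:
        return True
    return any(can_trace(lts, q, trace[1:])
               for (s, a, q) in lts if s == state and a == trace[0])
```

For instance, only \(r\) can exhibit the trace req.ack.ack, and only \(q\) can exhibit lim.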

Fig. 3. The syntax and semantics of \(\mu \mathrm {\textsc {HML}}\).

The Logic. \(\mu \mathrm {\textsc {HML}}\) [2, 33] is a branching-time logic that can be used to specify correctness properties over systems modelled in terms of LTSs. Its syntax, given in Fig. 3, assumes a countable set of logical variables \(X,Y \in \textsc {LVar} \), thereby allowing formulae to recursively express largest and least fixpoints using \({\mathbf {\mathsf{{{max}}}}}\,X.\varphi \) and \({\mathbf {\mathsf{{{min}}}}}\,X.\varphi \) resp.; these constructs bind free instances of the variable \(X \) in \(\varphi \) and induce the usual notions of free and closed formulae—we work up to alpha-conversion of bound variables. In addition to the standard constructs for truth, falsity, conjunction and disjunction, the \(\mu \mathrm {\textsc {HML}}\) syntax includes the necessity and possibility modalities, one of the main distinctive features of the logic.

The semantics of \(\mu \mathrm {\textsc {HML}}\) is defined in terms of the function mapping formulae \(\varphi \) to the set of LTS states \(S \subseteq \textsc {Sys}\) satisfying them. Figure 3 describes the semantics for open and closed formulae, and uses a map \(\rho \in \textsc {LVar} \rightharpoonup 2^{\textsc {Sys}}\) from variables to sets of system states to enable an inductive definition on the structure of the formula \(\varphi \). The formula \({{\mathbf {\mathsf{{tt}}}}}\) is satisfied by all processes, while \({{\mathbf {\mathsf{{ff}}}}}\) is satisfied by none; conjunctions and disjunctions bear the standard set-theoretic meaning of intersection and union. Necessity formulae \([\alpha ]\varphi \) state that for all system executions producing event \(\alpha \) (possibly none), the subsequent system state must then satisfy \(\varphi \) (i.e., \(\forall p' \), \(p \!\overset{\alpha }{\Longrightarrow }\!p' \text { implies }p' \in \llbracket \varphi ,\rho \rrbracket \) must hold). Possibility formulae \(\langle \alpha \rangle \varphi \) require the existence of at least one system execution with event \(\alpha \) whereby the subsequent state then satisfies \(\varphi \) (i.e., \(\exists p' \), \(p \!\overset{\alpha }{\Longrightarrow }\!p' \text { and }p' \in \llbracket \varphi ,\rho \rrbracket \) must hold). The recursive formulae \({\mathbf {\mathsf{{{max}}}}}\,X.\varphi \) and \({\mathbf {\mathsf{{{min}}}}}\,X.\varphi \) are resp. satisfied by the largest and least set of system states satisfying \(\varphi \). The semantics of recursive variables \(X\) w.r.t. an environment instance \(\rho \) is given by the mapping of \(X\) in \(\rho \), i.e., the set of processes associated with \(X\). 
Closed formulae (i.e., formulae containing no free variables) are interpreted independently of the environment \(\rho \), and the shorthand \(\llbracket \varphi \rrbracket \) is used to denote \(\llbracket \varphi ,\rho \rrbracket \), i.e., the set of system states in Sys that satisfy \(\varphi \). In view of this, we say that a system (state) \(p\) satisfies some closed formula \(\varphi \) whenever \(p \in \llbracket \varphi \rrbracket \), and conversely, that it violates \(\varphi \) whenever \(p \notin \llbracket \varphi \rrbracket \). We highlight two basic formulae that are used pervasively in \(\mu \mathrm {\textsc {HML}}\): \(\langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \) describes systems that can produce action \(\alpha \), while \([\alpha ]{{\mathbf {\mathsf{{ff}}}}} \) describes systems that cannot produce action \(\alpha \). Note also that \([\alpha ]{{\mathbf {\mathsf{{tt}}}}} \) is semantically equivalent to \({{\mathbf {\mathsf{{tt}}}}}\) whereas \(\langle \alpha \rangle {{\mathbf {\mathsf{{ff}}}}} \) equates to \({{\mathbf {\mathsf{{ff}}}}}\).
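For a finite LTS, the semantics of Fig. 3 can be computed directly, with the fixpoints obtained by Knaster-Tarski iteration on the finite powerset lattice. The sketch below uses our own tuple encoding of formulae and assumes a \(\tau \)-free LTS, so the weak modalities coincide with single transitions:

```python
# A sketch of the muHML semantics over a finite, tau-free LTS encoded as
# (state, action, state) edges. Formulae are tuples, e.g.
# ("nec", "lim", ("ff",)) for [lim]ff; this encoding is our own.
def states(lts):
    """All states mentioned in the edge set."""
    return {s for (s, _, _) in lts} | {q for (_, _, q) in lts}

def sem(lts, phi, env=None):
    """The set of states of `lts` satisfying `phi` under environment `env`."""
    env = env if env is not None else {}
    S = states(lts)
    tag = phi[0]
    if tag == "tt":
        return set(S)
    if tag == "ff":
        return set()
    if tag == "var":
        return set(env[phi[1]])
    if tag == "and":
        return sem(lts, phi[1], env) & sem(lts, phi[2], env)
    if tag == "or":
        return sem(lts, phi[1], env) | sem(lts, phi[2], env)
    if tag == "nec":   # [a]phi: every a-derivative must satisfy phi
        body = sem(lts, phi[2], env)
        return {p for p in S
                if all(q in body for (s, a, q) in lts if s == p and a == phi[1])}
    if tag == "pos":   # <a>phi: some a-derivative satisfies phi
        body = sem(lts, phi[2], env)
        return {p for p in S
                if any(q in body for (s, a, q) in lts if s == p and a == phi[1])}
    if tag in ("max", "min"):  # fixpoints by iteration on the finite lattice
        X, body = phi[1], phi[2]
        cur = set(S) if tag == "max" else set()
        while True:
            nxt = sem(lts, body, {**env, X: cur})
            if nxt == cur:
                return cur
            cur = nxt
    raise ValueError(f"unknown construct: {tag}")
```

Iteration from the full (resp. empty) set of states converges to the greatest (resp. least) fixpoint because the state space is finite and the semantic function is monotone.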

Example 2

Recall the server implementations \(p\), \(q\) and \(r\) depicted in Fig. 2.

$$\begin{aligned} \varphi _1&= \langle \texttt {req}\rangle \langle \texttt {ack}\rangle {\langle \texttt {req}\rangle {{\mathbf {\mathsf{{tt}}}}} }&\varphi _2&= [\texttt {lim}][\texttt {req}]{{\mathbf {\mathsf{{ff}}}}} \\ \varphi _3&= [\texttt {req}][\texttt {ack}] {\langle \texttt {req}\rangle {{\mathbf {\mathsf{{tt}}}}} }&\varphi _4&= [\texttt {req}][\texttt {ack}] {\langle \texttt {req}\rangle {{\mathbf {\mathsf{{tt}}}}} }\wedge \langle \texttt {lim}\rangle {{\mathbf {\mathsf{{tt}}}}} \end{aligned}$$
$$\begin{aligned} \varphi _5 = {\mathbf {\mathsf{{{max}}}}}\,X.\bigl ([\texttt {req}]([\texttt {ack}]X \wedge [\texttt {ack}][\texttt {ack}]{{\mathbf {\mathsf{{ff}}}}} ) \bigr ) \\ \varphi _6 = {\mathbf {\mathsf{{{min}}}}}\,X.(\langle \texttt {req}\rangle \langle \texttt {ack}\rangle X \vee \langle \texttt {lim}\rangle {{\mathbf {\mathsf{{tt}}}}} ) \end{aligned}$$

Formula \(\varphi _1\) describes systems that can produce a req after some serviced request (i.e., a req followed by an ack); all server implementations \(p,q \) and \(r \) satisfy this property. Formula \(\varphi _2\) states that whenever a system produces the event lim, it cannot produce any req actions. Again all three implementations satisfy this property where, in particular, \(p\) and \(r\) satisfy this trivially since both never produce a lim event. Formula \(\varphi _3\) strengthens property \(\varphi _1\) by requiring that a system can produce a req after any serviced request. While \(p \) and \(q \) satisfy this requirement, implementation \(r \) violates this property whenever it (non-deterministically) transitions to state \(r ''\). Formula \(\varphi _4\) strengthens this property further, by requiring the implementation to be capable of producing the lim event; only \(q\) satisfies this property.

Formula \(\varphi _5\) specifies a (recursive) safety property that prohibits a system from producing duplicate acknowledgements in answer to client requests after any number of serviced requests. System \(r\) violates \(\varphi _5\) via any trace in the regular language \((\texttt {req}.\texttt {ack})^{+}.\texttt {ack}\). Finally, \(\varphi _6\) describes systems that can reach a service limit after a number (possibly zero) of serviced requests. System \(q\) satisfies \(\varphi _6\) immediately, as opposed to \(p\) and \(r\), which never reach such a state after any number of serviced requests. We note that if the minimal fixpoint recursion operator in \(\varphi _6\) is substituted with a maximal fixpoint, i.e., \({\mathbf {\mathsf{{{max}}}}}\,X.(\langle \texttt {req}\rangle \langle \texttt {ack}\rangle X \vee \langle \texttt {lim}\rangle {{\mathbf {\mathsf{{tt}}}}} )\), implementations \(p\) and \(r\) would also satisfy it via the infinite sequence of events \(\texttt {req}.\texttt {ack}.\texttt {req}.\texttt {ack}\ldots \) \(\blacksquare \)
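The violation language \((\texttt {req}.\texttt {ack})^{+}.\texttt {ack}\) quoted for \(\varphi _5\) can be sanity-checked with an ordinary regular expression over dot-separated event names (a presentational convenience of ours, not part of the formal development):

```python
# Hedged sketch: membership test for the violation language (req.ack)+.ack,
# with traces written as dot-separated event names (our own convention).
import re

VIOLATION_LANG = re.compile(r"^(req\.ack\.)+ack$")

def witnesses_phi5_violation(trace):
    """Does this trace belong to (req.ack)+.ack, the violation language of phi5?"""
    return VIOLATION_LANG.match(trace) is not None
```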

3 Dial M for Monitor

In [25, 26], monitors are also perceived as LTSs, expressed through the syntax of Fig. 5. The syntax comprises three verdict constructs, \({\mathbf {\mathsf{{{yes}}}}}\), \({{\mathbf {\mathsf{{no}}}}}\) and \({{\mathbf {\mathsf{{end}}}}}\), resp. denoting acceptance, rejection and termination (i.e., an inconclusive outcome). It also includes action \(\alpha \) prefixing, mutually-exclusive (external) choice and recursion, denoted by \({{\mathbf {\mathsf{{rec}}}}}\,x.m \), acting as a binder for the recursion variable \(x \) in \(m \). All recursive monitors are assumed to be guarded, meaning that all occurrences of bound variables occur under the context of an action prefix. In the sequel, we assume closed monitor terms, where all variable occurrences are bound.

Every monitor term may be given a LTS semantics via the transition rules shown in Fig. 5. The rules are fairly standard: for example, \(m _1 \,\textsf {+}\, m _2 \xrightarrow {\;\mu \;} m _3\) (for some \(m _3\)) if either \(m _1 \xrightarrow {\;\mu \;} m _3\) or \(m _2 \xrightarrow {\;\mu \;} m _3\) (by rules SelL or SelR). The only transition rule worth drawing attention to, Ver, specifies that verdicts v may transition with any \(\alpha \in \textsc {Act} \) to return to the same state, modelling the assumed requirement from Fig. 1 that verdicts are irrevocable.
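The monitor dynamics can be rendered as a small step function over tuple-encoded terms. The sketch below is our own illustration; for brevity it collapses the silent unfolding of \({{\mathbf {\mathsf{{rec}}}}}\,x.m \) into the visible move that follows it (guardedness ensures this terminates):

```python
# A sketch of the monitor dynamics of Fig. 5 over tuple-encoded terms:
# ("yes",), ("no",), ("end",), ("act", a, m), ("choice", m1, m2),
# ("rec", x, m), ("var", x). This encoding is ours, not the paper's.
def unfold(m, x, r):
    """Substitute the recursive term r for free occurrences of variable x in m."""
    tag = m[0]
    if tag == "var":
        return r if m[1] == x else m
    if tag == "act":
        return ("act", m[1], unfold(m[2], x, r))
    if tag == "choice":
        return ("choice", unfold(m[1], x, r), unfold(m[2], x, r))
    if tag == "rec" and m[1] == x:
        return m                                  # x is rebound here
    if tag == "rec":
        return ("rec", m[1], unfold(m[2], x, r))
    return m                                      # verdicts are closed

def mstep(m, alpha):
    """All monitor terms reachable from m via the visible action alpha."""
    tag = m[0]
    if tag in ("yes", "no", "end"):               # rule Ver: verdicts loop
        return {m}
    if tag == "act":                              # rule Act
        return {m[2]} if m[1] == alpha else set()
    if tag == "choice":                           # rules SelL / SelR
        return mstep(m[1], alpha) | mstep(m[2], alpha)
    if tag == "rec":                              # unfold, then move
        return mstep(unfold(m[2], m[1], m), alpha)
    return set()
```

Note how rule Ver makes every verdict a self-loop on all visible actions, so a verdict, once reached, cannot be retracted.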

Fig. 4. The LTS representation for monitor \(m _1=(\texttt {req}.{\texttt {ack}.{\texttt {ack}.{{\mathbf {\mathsf{{{yes}}}}}}}} \,\textsf {+}\, \texttt {lim}.{{{\mathbf {\mathsf{{no}}}}}})\).

Fig. 5. Monitors and instrumentation.

Example 3

Monitor \(m _1=(\texttt {req}.{\texttt {ack}.{\texttt {ack}.{{\mathbf {\mathsf{{{yes}}}}}}}} \,\textsf {+}\, \texttt {lim}.{{{\mathbf {\mathsf{{no}}}}}})\) can be represented diagrammatically via the LTS depicted in Fig. 4. This LTS is derived using the monitor dynamics from Fig. 5, by applying the rules that match the edge label and the structure of the term under consideration. Every term that results from each rule application essentially represents a LTS state. For example, the edge labelled lim from node \(m _1\) to node \({{\mathbf {\mathsf{{no}}}}}\) in Fig. 4 follows from the transition

$$\begin{aligned}&{\texttt {req}.{\texttt {ack}.{\texttt {ack}.{{\mathbf {\mathsf{{{yes}}}}}}}} \,\textsf {+}\, \texttt {lim}.{{{\mathbf {\mathsf{{no}}}}}}} \;\xrightarrow {\;\,\texttt {lim}\,\;}\; {{{\mathbf {\mathsf{{no}}}}}} \end{aligned}$$

derived by applying rule SelR and rule Act. From the LTS in Fig. 4, we can see that monitor \(m _1\) reaches an acceptance verdict whenever it analyses the sequence of events \(\texttt {req}.\texttt {ack}.\texttt {ack}\), and a rejection verdict when the single event lim is analysed. \(\blacksquare \)

In [26], a system \(p \) can be instrumented with a monitor \(m\) (referred to hereafter as a monitored system and denoted as \(m \triangleleft \, p \)) by composing their respective LTSs. The semantics of \(m \triangleleft \, p \) is defined by the instrumentation rules in Fig. 5. We highlight the generality of the instrumentation relation \( \triangleleft \, \). It is parametric w.r.t. the system and monitor abstract LTSs and is largely independent of their specific syntax: it only requires the monitor LTS to contain an inconclusive (persistent) verdict state, \({{\mathbf {\mathsf{{end}}}}}\). Instrumentation is asymmetric, and the monitored system transitions with an observable event only when the system is able to exhibit said event. The suggestive symbol \( \triangleleft \, \) alludes to this unidirectional composition, indicating that trace events flow from the system into the monitor. The monitor is in this sense passive as it does not interact with the system, but transitions in tandem with it according to the rules in Fig. 5. When the system exhibits an observable event that can be analysed by the monitor, the two synchronise and transition in lockstep according to their respective LTSs via the rule Mon. When the monitor cannot analyse the aforementioned event, and it is not able to transition internally to a state that permits it to do so (i.e., it is already stable, \(m \not \!\xrightarrow {\;\tau \;}\)), the system is not blocked by the instrumentation. Instead, it is allowed to transition, whereas the monitor is aborted to the inconclusive verdict \({{\mathbf {\mathsf{{end}}}}}\), as per rule Ter. The system-monitor synchronisation is limited to visible actions, and both system and monitor can transition independently w.r.t. their own internal \(\tau \)-action (rules AsS and AsM).
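For \(\tau \)-free systems and monitors (so rule Ter's stability side-condition holds trivially), the instrumentation rules Mon and Ter can be sketched as a step function on monitored states. The encoding below is our own illustration, with a deliberately minimal monitor dynamics (prefix, choice, verdicts):

```python
# A sketch of the instrumentation rules Mon and Ter of Fig. 5 for tau-free
# systems and monitors. Term and edge encodings are our own.
def mstep(m, alpha):
    """Minimal monitor dynamics: verdicts loop (Ver), prefix (Act), choice (Sel)."""
    if m[0] in ("yes", "no", "end"):
        return {m}
    if m[0] == "act":
        return {m[2]} if m[1] == alpha else set()
    if m[0] == "choice":
        return mstep(m[1], alpha) | mstep(m[2], alpha)
    return set()

def instep(m, p, alpha, lts):
    """Successor monitored states of m <| p when the system performs alpha."""
    sys_next = {q for (s, a, q) in lts if s == p and a == alpha}
    if not sys_next:
        return set()                   # the system cannot exhibit alpha
    mon_next = mstep(m, alpha)
    if mon_next:                       # rule Mon: synchronise in lockstep
        return {(m2, q) for m2 in mon_next for q in sys_next}
    # rule Ter: the system proceeds; the monitor aborts to "end"
    return {(("end",), q) for q in sys_next}
```

Note that the system is never blocked: when the monitor cannot follow, it is the monitor that is discarded to \({{\mathbf {\mathsf{{end}}}}}\).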

Fig. 6. The LTS depicting the behaviour of the monitored system \(m \triangleleft \, r \).

Example 4

Fig. 6 depicts the LTS of the monitored system \(m _1 \triangleleft \, r \) that results from the composition of system \(r\) from Fig. 2 with monitor \(m _1\) from Fig. 4. Any verdict that \(m _1\) arrives at is subject to the execution path that \(r\) decides to follow at runtime. In Fig. 6, a rejection can never be reached since \(r\) does not produce event lim. Furthermore, when event req is exhibited by \(r\), the monitored system may non-deterministically transition to either \(\texttt {ack}.{\texttt {ack}.{{\mathbf {\mathsf{{{yes}}}}}}} \triangleleft \, r '\) or \(\texttt {ack}.{\texttt {ack}.{{\mathbf {\mathsf{{{yes}}}}}}} \triangleleft \, r ''\) (cases A and B in Fig. 6)—this impinges on whether \(m _1\) reaches an acceptance verdict or not.

For case A, state \(r '\) generates an ack event followed by req; the monitor at this stage, \(\texttt {ack}.{{\mathbf {\mathsf{{{yes}}}}}}\), is not expecting a req event, but ack instead. It therefore aborts monitoring, reaching the inconclusive verdict \({{\mathbf {\mathsf{{end}}}}}\) using the instrumentation rule Ter (Fig. 5). If instead, the monitored system \(m _1 \triangleleft \, r\) transitions to \(\texttt {ack}.{\texttt {ack}.{{\mathbf {\mathsf{{{yes}}}}}}} \triangleleft \, r ''\) (case B), it then reaches an acceptance verdict after analysing two consecutive ack events. In either outcome, the monitor verdict is preserved once it is reached via rule Ver. \(\blacksquare \)

4 Sense and Monitorability

To understand monitorability in our setting, we need to relate acceptances (\({\mathbf {\mathsf{{{yes}}}}}\)) and rejections (\({{\mathbf {\mathsf{{no}}}}}\)) reached by a monitor \(m\) when monitoring a system \(p\) with satisfactions (\(p \in \llbracket \varphi \rrbracket \)) and violations (\(p \not \in \llbracket \varphi \rrbracket \)) for that system w.r.t. some closed \(\mu \mathrm {\textsc {HML}}\) formula \(\varphi \). This will, in turn, allow us to determine when a monitor \(m\) represents (in some precise sense) a property \(\varphi \).

Example 5

Consider the simple monitor \(m _2 =\texttt {lim}.{{{\mathbf {\mathsf{{no}}}}}}\) which rejects all the systems that produce the event lim. Since any such system \(p\) would necessarily violate the property \(\varphi _7=[\texttt {lim}]{{\mathbf {\mathsf{{ff}}}}} \), i.e., \(p \not \in \llbracket [\texttt {lim}]{{\mathbf {\mathsf{{ff}}}}} \rrbracket \), there exists a tight correspondence between any system rejected by \(m _2\) and the systems violating \(\varphi _7\). By contrast, the more elaborate monitor \(m _3 = {{\mathbf {\mathsf{{rec}}}}}\,x.(\texttt {req}.{\texttt {ack}.{x}} \,\textsf {+}\, \texttt {lim}.{{\mathbf {\mathsf{{{yes}}}}}})\) reaches the acceptance verdict for systems that exhibit a trace consisting of a sequence of serviced requests \((\texttt {req}.{} \texttt {ack})^*\) followed by the event lim. It turns out that this monitor can only reach an acceptance for systems that satisfy the property \(\varphi _6 = {\mathbf {\mathsf{{{min}}}}}\,X.(\langle \texttt {req}\rangle \langle \texttt {ack}\rangle X \vee \langle \texttt {lim}\rangle {{\mathbf {\mathsf{{tt}}}}} )\) from Example 2. Stated otherwise, there is a correspondence between the systems satisfying \(\varphi _6\), i.e., \(p \in \llbracket \varphi _6\rrbracket \), and those accepted by monitor \(m _3\). \(\blacksquare \)
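Monitor \(m _3\) can equivalently be viewed as a two-state machine plus verdicts. The following sketch (our own encoding, not the paper's) checks which finite traces drive it to the acceptance verdict:

```python
# A sketch of m3 = rec x.(req.ack.x + lim.yes) as a two-state machine plus
# verdicts; state names are our own.
def m3_accepts(trace):
    """Does running m3 over this trace reach the yes verdict?"""
    state = "x"                       # the recursion point of m3
    for a in trace:
        if state == "x" and a == "req":
            state = "ack-pending"     # took the req branch; an ack must follow
        elif state == "x" and a == "lim":
            return True               # reached yes; verdicts are irrevocable
        elif state == "ack-pending" and a == "ack":
            state = "x"               # one serviced request; loop back
        else:
            return False              # monitor aborts to end (rule Ter)
    return False                      # trace ended before a verdict
```

The accepted traces are exactly those of the form \((\texttt {req}.\texttt {ack})^{*}.\texttt {lim}\), possibly followed by further events.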

However, such correspondences are not as clear-cut for certain monitors.

Example 6

Monitor \(m _1\) from Example 3 reaches an acceptance verdict when composed with a system \(p _\bot \) that exhibits the trace \(\texttt {req}.\texttt {ack}.\texttt {ack}\), and a rejection verdict for the same system \(p _\bot \) when it exhibits the trace lim (i.e., \(m _1 \triangleleft \, p _\bot \!\overset{\texttt {req}.\texttt {ack}.\texttt {ack}}{\Longrightarrow }\! {\mathbf {\mathsf{{{yes}}}}} \triangleleft \, p \) and \(m _1 \triangleleft \, p _\bot \!\overset{\texttt {lim}}{\Longrightarrow }\! {{\mathbf {\mathsf{{no}}}}} \triangleleft \, p' \) for some states \(p \) and \(p' \)). If we associate \(m _1\) with a correctness property such as \(\varphi _8=\langle \texttt {req}\rangle \langle \texttt {ack}\rangle {\langle \texttt {ack}\rangle {{\mathbf {\mathsf{{tt}}}}} } \wedge [\texttt {lim}]{{\mathbf {\mathsf{{ff}}}}} \) and attempt to link acceptances to satisfactions and rejections to violations (as in Example 5) we end up with a logical contradiction, namely that \(p _\bot \in \llbracket \varphi _8\rrbracket \) and \(p _\bot \not \in \llbracket \varphi _8\rrbracket \). \(\blacksquare \)

A correspondence between monitor judgements and \(\mu \mathrm {\textsc {HML}}\) properties is established in [26]; it relies on the predicates \(\textsf {acc}(p,m)\) and \(\textsf {rej}(p,m)\), stating resp. that the monitored system \(m \triangleleft \, p \) can reach an acceptance (\({\mathbf {\mathsf{{{yes}}}}}\)) or a rejection (\({{\mathbf {\mathsf{{no}}}}}\)) verdict along some execution:

Definition 1

(Sound Monitoring). A monitor \(m \) monitors soundly for the property represented by the formula \(\varphi \), denoted as \(\textsf {smon}(m,\varphi )\), whenever for every system \(p \in \textsc {Sys} \) the following conditions hold: (i) \( \textsf {acc}(p,m) \; \text {implies}\; p \in \llbracket \varphi \rrbracket \; \), (ii) \( \textsf {rej}(p,m) \; \text {implies}\; p \not \in \llbracket \varphi \rrbracket \). \(\blacksquare \)

Definition 1 universally quantifies over all system states \(p\) for a formula \(\varphi \) that is soundly monitored by a monitor \(m\) (this may, in general, be an infinite set). It also rules out contradictory monitor verdicts. Indeed, whenever the predicate \(\textsf {smon}(m,\varphi )\) holds for some monitor \(m\) and formula \(\varphi \), and there exists some system \(p\) where \(\textsf {acc}(p,m)\), it must be the case that \(p \in \llbracket \varphi \rrbracket \) by Definition 1. Thus, from the definition of logical satisfaction we have \(\lnot (p \not \in \llbracket \varphi \rrbracket )\), and by the contrapositive of Definition 1, we must also have \(\lnot \textsf {rej}(p,m)\). A symmetric argument applies for any system \(p\) where \(\textsf {rej}(p,m)\), from which \(\lnot \textsf {acc}(p,m)\) follows. Sound monitoring is arguably the least requirement for relating a monitor with a logical property. Further to this, the obvious additional stipulation would be to ask for the dual of Definition 1, namely complete monitoring for \(m \) and \(\varphi \). Intuitively, this would state that for all \(p \), \(p \in \llbracket \varphi \rrbracket \) implies \(\textsf {acc}(p,m)\), and also that \(p \not \in \llbracket \varphi \rrbracket \) implies \(\textsf {rej}(p,m)\). However, such a demand turns out to be too stringent for a large part of the logic presented in Fig. 3.
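For finite systems, the two implications of Definition 1 can be tested by brute force on traces up to a bounded length. The sketch below (our own code; it samples individual systems rather than deciding the universal quantification) checks the monitor \(\texttt {lim}.{{{\mathbf {\mathsf{{no}}}}}}\) against \([\texttt {lim}]{{\mathbf {\mathsf{{ff}}}}} \) on two of the servers from Fig. 2:

```python
# A brute-force, bounded-depth sketch of acc/rej for action-prefix monitors
# over a finite, tau-free LTS. Encodings and depth bound are our own choices.
def traces(lts, p, n):
    """All traces of length at most n that state p can exhibit."""
    yield ()
    if n > 0:
        for (s, a, q) in lts:
            if s == p:
                for t in traces(lts, q, n - 1):
                    yield (a,) + t

def run(m, t):
    """Run an action-prefix monitor over a trace; verdicts are irrevocable."""
    for a in t:
        if m[0] in ("yes", "no"):
            break                      # rule Ver: verdict already reached
        if m[0] == "act" and m[1] == a:
            m = m[2]                   # rule Act
        else:
            return ("end",)            # rule Ter: the monitor aborts
    return m

def verdicts(lts, p, m, n=4):
    """All verdicts reachable when monitoring p with m, up to depth n."""
    vs = {run(m, t)[0] for t in traces(lts, p, n)}
    return vs & {"yes", "no", "end"}
```

The rejection verdict for `Q` below agrees with the semantics: `Q` can do lim, so it violates \([\texttt {lim}]{{\mathbf {\mathsf{{ff}}}}} \), while `P` satisfies the formula and is never rejected.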

Example 7

Consider the core basic formula \(\langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \), describing processes that can perform action \(\alpha \). One could demonstrate that the simple monitor \(\alpha .{{\mathbf {\mathsf{{{yes}}}}}}\) satisfies the condition that, for all systems \(p \), \(p \in \llbracket \varphi \rrbracket \text { implies }\textsf {acc}(p,m)\). However, for this formula, no sound monitor exists satisfying the condition that whenever \(p \not \in \llbracket \varphi \rrbracket \) then \(\textsf {rej}(p,m)\). For, assume that one such (sound) monitor \(m\) exists satisfying this condition. Since \({{\mathbf {\mathsf{{0}}}}}\not \in \llbracket \langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \rrbracket \), \(\textsf {rej}({{\mathbf {\mathsf{{0}}}}},m)\) follows by our assumption. This means that this particular monitor can reach a rejection without needing to observe any actions (i.e., since \({{\mathbf {\mathsf{{0}}}}}\) does not produce any actions, we have \(m \!\overset{}{\Longrightarrow }\!{{\mathbf {\mathsf{no}}}} \)). Such a statement would, in turn, imply that \(\textsf {rej}(\alpha .{{{\mathbf {\mathsf{{0}}}}}},m)\) also holds (since \(m\) can reach a rejection verdict without observing any actions, it can do so for any system) although, clearly, \(\alpha .{{{\mathbf {\mathsf{{0}}}}}} \in \llbracket \langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \rrbracket \). This makes \(m\) unsound, contradicting our initial assumption that \(m \) was sound. \(\blacksquare \)

A dual argument to that of Example 7 can be carried out for another core basic formula in \(\mu \mathrm {\textsc {HML}}\), namely \([\alpha ]{{\mathbf {\mathsf{{ff}}}}} \), describing the property of not being able to produce action \(\alpha \). Although there are sound monitors \(m \) satisfying the condition that for all systems p, if \({p \not \in \llbracket [\alpha ]{{\mathbf {\mathsf{{ff}}}}} \rrbracket } \text { then }\textsf {rej}(p,m)\), there are none that also satisfy the condition that for any system \(p\), \(\text {if } p \in \llbracket [\alpha ]{{\mathbf {\mathsf{{ff}}}}} \rrbracket \text { then }\textsf {acc}(p,m)\).

The counterarguments posed for the core formulae \(\langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \) and \([\alpha ]{{\mathbf {\mathsf{{ff}}}}} \) are enough evidence to convince us that requiring complete monitoring would exclude a large number of useful formulae in \(\mu \mathrm {\textsc {HML}}\). In fact, the complete monitoring requirement would severely limit correspondence to formulae that are semantically equivalent to \({{\mathbf {\mathsf{{tt}}}}}\) and \({{\mathbf {\mathsf{{ff}}}}}\) only—admittedly, this would not be very useful. In view of this, we define a weaker form of completeness where we require completeness w.r.t. either logical satisfactions or violations, but not both.

Definition 2

(Partially-Complete Monitoring). A monitor \(m\) can monitor for a property \(\varphi \) in a satisfaction-complete, \(\textsf {scmon}(m,\varphi )\), or a violation-complete, \(\textsf {vcmon}(m,\varphi )\), manner. These are defined as follows:

$$\begin{aligned} \textsf {scmon}(m,\varphi )&\;\mathop {=}\limits ^{{\mathrm {def}}}\; \forall p. \,p \in \llbracket \varphi \rrbracket \text { implies }\textsf {acc}(p,m) \quad \quad \text {(satisfaction-complete)}\\ \textsf {vcmon}(m,\varphi )&\;\mathop {=}\limits ^{{\mathrm {def}}}\; \forall p. \, p \not \in \llbracket \varphi \rrbracket \text { implies }\textsf {rej}(p,m) \quad \quad \text {(violation-complete)} \end{aligned}$$

A monitor \(m\) monitors for formula \(\varphi \) in a partially-complete manner, denoted as \(\textsf {cmon}(m,\varphi )\), when either \(\textsf {scmon}(m,\varphi )\) or \(\textsf {vcmon}(m,\varphi )\) holds. \(\blacksquare \)

With the partially-complete monitoring predicate \(\textsf {cmon}(m,\varphi )\) of Definition 2 and the sound monitoring predicate of Definition 1 in place, we are now in a position to formalise our touchstone notion of monitor-formula correspondence.

Definition 3

(Monitor-Formula Correspondence). A monitor \(m\) is said to monitor for a formula \(\varphi \), denoted as \(\textsf {mon}(m,\varphi )\), if it can do so soundly and in a partially-complete manner:

$$\begin{aligned} \textsf {mon}(m,\varphi )&\;\mathop {=}\limits ^{{\mathrm {def}}}\; \textsf {smon}(m,\varphi ) \;\text {and}\; \textsf {cmon}(m,\varphi ) \end{aligned}$$

\(\blacksquare \)

Example 8

Consider the monitor \(\alpha .{{\mathbf {\mathsf{{{yes}}}}}}\) and the basic formula \(\langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \). One can verify that \(\alpha .{{\mathbf {\mathsf{{{yes}}}}}}\) monitors for \(\langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \), i.e., \(\textsf {mon}(\alpha .{{\mathbf {\mathsf{{{yes}}}}}},\langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} )\). Intuitively, this is because every system that \(\alpha .{{\mathbf {\mathsf{{{yes}}}}}}\) accepts must generate the trace \(\alpha \), which is precisely the evidence required to show that the system satisfies \(\langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \). From this information, we can deduce that both requirements of Definition 3, i.e., \(\textsf {smon}(\alpha .{{\mathbf {\mathsf{{{yes}}}}}},\langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} )\) and \(\textsf {cmon}(\alpha .{{\mathbf {\mathsf{{{yes}}}}}},\langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} )\), hold.

Consider now the same monitor against the formula \(\langle \alpha \rangle \langle \beta \rangle {{\mathbf {\mathsf{{tt}}}}} \). According to Definition 3, \(\alpha .{{\mathbf {\mathsf{{{yes}}}}}}\) does not monitor for \(\langle \alpha \rangle \langle \beta \rangle {{\mathbf {\mathsf{{tt}}}}} \). We can show this via the counterexample system \(\alpha .{{{\mathbf {\mathsf{{0}}}}}}\): it is not hard to see that the assertion \(\textsf {acc}(\alpha .{{{\mathbf {\mathsf{{0}}}}}},\alpha .{{\mathbf {\mathsf{{{yes}}}}}})\) holds, but the monitored system does not satisfy the formula, i.e., \(\alpha .{{{\mathbf {\mathsf{{0}}}}}} \not \in \llbracket \langle \alpha \rangle \langle \beta \rangle {{\mathbf {\mathsf{{tt}}}}} \rrbracket \). This makes the monitor \(\alpha .{{\mathbf {\mathsf{{{yes}}}}}}\) unsound, i.e., \(\lnot \textsf {smon}(\alpha .{{\mathbf {\mathsf{{{yes}}}}}},\langle \alpha \rangle \langle \beta \rangle {{\mathbf {\mathsf{{tt}}}}} )\) from Definition 1.

By Definition 3, the monitor \(\alpha .{{{\mathbf {\mathsf{{no}}}}}} \,\textsf {+}\, \beta .{{{\mathbf {\mathsf{{no}}}}}}\) monitors for \([\alpha ]{{\mathbf {\mathsf{{ff}}}}} \wedge [\beta ]{{\mathbf {\mathsf{{ff}}}}} \) by rejecting all the systems that violate the property. This is the case because a system violates \([\alpha ]{{\mathbf {\mathsf{{ff}}}}} \wedge [\beta ]{{\mathbf {\mathsf{{ff}}}}} \) if and only if it exhibits either trace \(\alpha \) or trace \(\beta \). Such systems are precisely those that are rejected by the monitor \(\alpha .{{{\mathbf {\mathsf{{no}}}}}} \,\textsf {+}\, \beta .{{{\mathbf {\mathsf{{no}}}}}}\). \(\blacksquare \)

Definition 3 describes the relationship between monitors and logical formulae from the monitor’s perspective. Dually, monitorability is an attribute of a logical formula, describing its ability to be adequately analysed at runtime by a monitor. The concept of monitorability can also be lifted to sets of formulae (i.e., a sublogic). In this sense, the definition of monitorability is dependent on the monitoring set-up assumed (i.e., the semantics of Fig. 5) and the conditions that constitute an adequate runtime analysis. In what follows, we discuss the monitorability of our logic \(\mu \mathrm {\textsc {HML}}\) assuming Definition 3 as our base notion for adequate monitoring.

Definition 4

(Monitorability). A formula \(\varphi \in \mu \mathrm {\textsc {HML}} \) is monitorable iff there exists a monitor \(m \) such that \(\textsf {mon}(m,\varphi )\). A set of formulae \(\mathcal {L}\subseteq \mu \mathrm {\textsc {HML}} \) is monitorable iff every formula in the set, \(\varphi \in \mathcal {L}\), is monitorable. \(\blacksquare \)

Showing that a set of formulae is monitorable is, in general, non-trivial since proving Definition 4 entails two universal quantifications: one for all the formulae in the formula set, and another one for all the systems of the LTS over which the semantics of the formulae is defined. In both cases, the respective sets may be infinite. We immediately note that not all logical formulae in \(\mu \mathrm {\textsc {HML}}\) are monitorable. The following example substantiates this claim.

Example 9

Through the witness monitor \(\alpha .{{{\mathbf {\mathsf{{no}}}}}} \,\textsf {+}\, \beta .{{{\mathbf {\mathsf{{no}}}}}}\) discussed in Example 8, we can show that formula \([\alpha ]{{\mathbf {\mathsf{{ff}}}}} \wedge [\beta ]{{\mathbf {\mathsf{{ff}}}}} \) is monitorable with a violation-complete monitor. By contrast, the variant formula \([\alpha ]{{\mathbf {\mathsf{{ff}}}}} \vee [\beta ]{{\mathbf {\mathsf{{ff}}}}} \) (swapping the conjunction with a disjunction operator) is not. This can be shown by arguing towards a contradiction. Assume that there exists some monitor \(m\) such that \(\textsf {mon}(m,[\alpha ]{{\mathbf {\mathsf{{ff}}}}} \vee [\beta ]{{\mathbf {\mathsf{{ff}}}}} )\). From Definition 3 we know that \(\textsf {smon}(m,[\alpha ]{{\mathbf {\mathsf{{ff}}}}} \vee [\beta ]{{\mathbf {\mathsf{{ff}}}}} )\) and \(\textsf {cmon}(m,[\alpha ]{{\mathbf {\mathsf{{ff}}}}} \vee [\beta ]{{\mathbf {\mathsf{{ff}}}}} )\) hold, and by Definition 2, there are two subcases to consider for \(\textsf {cmon}(m,[\alpha ]{{\mathbf {\mathsf{{ff}}}}} \vee [\beta ]{{\mathbf {\mathsf{{ff}}}}} )\):

  • If \(m\) is satisfaction-complete, \(\textsf {scmon}(m,[\alpha ]{{\mathbf {\mathsf{{ff}}}}} \vee [\beta ]{{\mathbf {\mathsf{{ff}}}}} )\), consider the specific system \(\beta .\) \({{\mathbf {\mathsf{{0}}}}}\): since \(\beta .{{{\mathbf {\mathsf{{0}}}}}} \in \llbracket [\alpha ]{{\mathbf {\mathsf{{ff}}}}} \rrbracket \), the semantics of the logic in Fig. 3 gives \(\beta .{{{\mathbf {\mathsf{{0}}}}}} \in \llbracket [\alpha ]{{\mathbf {\mathsf{{ff}}}}} \vee [\beta ]{{\mathbf {\mathsf{{ff}}}}} \rrbracket \), and hence \(\textsf {acc}(\beta .{{{\mathbf {\mathsf{{0}}}}}},m)\). From this acceptance, we know that \(m\) must be able to reach a satisfaction verdict either autonomously, \(m \!\overset{}{\Longrightarrow }\!{\mathbf {\mathsf{{{yes}}}}} \), or after observing action \(\beta \), \(m \!\overset{\beta }{\Longrightarrow }\!{\mathbf {\mathsf{{{yes}}}}} \). In both cases, \(m\) also accepts the system \(\alpha .{{{\mathbf {\mathsf{{0}}}}}} \,\textsf {+}\, \beta .{{{\mathbf {\mathsf{{0}}}}}}\), since there exists a trace that leads the monitored system \(m \triangleleft \, \alpha .{{{\mathbf {\mathsf{{0}}}}}} \,\textsf {+}\, \beta .{{{\mathbf {\mathsf{{0}}}}}}\) to an acceptance verdict. This makes \(m\) unsound (Definition 1) since \(\alpha .{{{\mathbf {\mathsf{{0}}}}}} \,\textsf {+}\, \beta .{{{\mathbf {\mathsf{{0}}}}}} \not \in \llbracket [\alpha ]{{\mathbf {\mathsf{{ff}}}}} \vee [\beta ]{{\mathbf {\mathsf{{ff}}}}} \rrbracket \), contradicting \(\textsf {smon}(m,[\alpha ]{{\mathbf {\mathsf{{ff}}}}} \vee [\beta ]{{\mathbf {\mathsf{{ff}}}}} )\).

  • If \(m\) is violation-complete, \(\textsf {vcmon}(m,[\alpha ]{{\mathbf {\mathsf{{ff}}}}} \vee [\beta ]{{\mathbf {\mathsf{{ff}}}}} )\), then, for the specific system \(\alpha .{{{\mathbf {\mathsf{{0}}}}}} \,\textsf {+}\, \beta .{{{\mathbf {\mathsf{{0}}}}}}\), we must have \(\textsf {rej}(\alpha .{{{\mathbf {\mathsf{{0}}}}}} \,\textsf {+}\, \beta .{{{\mathbf {\mathsf{{0}}}}}},m)\) since, by the logic semantics, we know \(\alpha .{{{\mathbf {\mathsf{{0}}}}}} \,\textsf {+}\, \beta .{{{\mathbf {\mathsf{{0}}}}}} \not \in \llbracket [\alpha ]{{\mathbf {\mathsf{{ff}}}}} \vee [\beta ]{{\mathbf {\mathsf{{ff}}}}} \rrbracket \). Now, from the structure of \(\alpha .{{{\mathbf {\mathsf{{0}}}}}} \,\textsf {+}\, \beta .{{{\mathbf {\mathsf{{0}}}}}}\), we can deduce that \(m \) can reach verdict \({{\mathbf {\mathsf{{no}}}}}\) along one of the traces \(\epsilon , \alpha \) or \(\beta \):

    • If it is the empty trace \(\epsilon \), then \(m\) must also reject the system \({{\mathbf {\mathsf{{0}}}}}\). However, \({{\mathbf {\mathsf{{0}}}}}\in \llbracket [\alpha ]{{\mathbf {\mathsf{{ff}}}}} \vee [\beta ]{{\mathbf {\mathsf{{ff}}}}} \rrbracket \) since \({{\mathbf {\mathsf{{0}}}}}\) cannot produce any action; this makes the monitor unsound, contradicting \(\textsf {smon}(m,[\alpha ]{{\mathbf {\mathsf{{ff}}}}} \vee [\beta ]{{\mathbf {\mathsf{{ff}}}}} )\).

    • If the trace is \(\alpha \), \(m\) must also reject the system \(\alpha .\) \({{\mathbf {\mathsf{{0}}}}}\) along the same trace \(\alpha \). This also makes the monitor unsound: from \(\alpha .{{{\mathbf {\mathsf{{0}}}}}} \in \llbracket [\beta ]{{\mathbf {\mathsf{{ff}}}}} \rrbracket \) and the semantics of Fig. 3 we deduce \(\alpha .{{{\mathbf {\mathsf{{0}}}}}} \in \llbracket [\alpha ]{{\mathbf {\mathsf{{ff}}}}} \vee [\beta ]{{\mathbf {\mathsf{{ff}}}}} \rrbracket \). The case for \(\beta \) is analogous. \(\blacksquare \)

Definition 5

(Monitorable Logic). Let \(\mathrm {m\textsc {HML}} = \mathrm {c\textsc {HML}} \cup \mathrm {s\textsc {HML}} \), where

$$\begin{aligned} \pi ,\varpi \in \mathrm {c\textsc {HML}} \,\,{:}{:=}&\;{{\mathbf {\mathsf{{tt}}}}}&|\; {{\mathbf {\mathsf{{ff}}}}}&|\;\pi \vee \varpi&|\;\langle \alpha \rangle \pi&|\;{\mathbf {\mathsf{{{min}}}}}\,X.\pi&|\; X \\ \theta ,\vartheta \in \mathrm {s\textsc {HML}} \,\,{:}{:=}&\;{{\mathbf {\mathsf{{tt}}}}}&|\; {{\mathbf {\mathsf{{ff}}}}}&|\;\theta \wedge \vartheta&|\; [\alpha ]\theta&|\;{\mathbf {\mathsf{{{max}}}}}\,X.\theta&|\; X \end{aligned}$$

\(\blacksquare \)

In [25, 26], the syntactic subset \(\mathrm {m\textsc {HML}}\) of Definition 5 is shown to be monitorable. At an intuitive level, \(\mathrm {m\textsc {HML}}\) consists of the co-safe and safe syntactic subsets of \(\mu \mathrm {\textsc {HML}}\), resp. labelled \(\mathrm {c\textsc {HML}}\) and \(\mathrm {s\textsc {HML}}\). The logical subset \(\mathrm {c\textsc {HML}}\) describes properties whose satisfying systems can provide a witness trace that enables the monitor to conclusively determine that they are included in the property. Dually, \(\mathrm {s\textsc {HML}}\) captures properties whose violating systems can provide a single witness trace that permits the monitor to conclude that they violate the property. Note that, for both \(\mathrm {c\textsc {HML}}\) and \(\mathrm {s\textsc {HML}}\), any extension of a witness trace used to reach a verdict is a witness trace itself.

Example 10

Recall formulae \(\varphi _1\), \(\varphi _2\), \(\varphi _5\) and \(\varphi _6\) from Example 2. We can establish that these are indeed monitorable in the sense of Definition 4 by performing a simple syntactic check against the grammar in Definition 5. This avoids complicated semantic reasoning that is usually harder to automate, such as that shown in Examples 8 and 9. \(\blacksquare \)
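The syntactic check alluded to above is straightforward to mechanise. The sketch below is illustrative and not taken from the tooling of the works cited: formulae are encoded as nested tuples, the helper names (`is_shml`, `is_chml`, `is_mhml`) are our own, and the tags `box`, `diam`, `max` and `min` stand for the \([\alpha ]\), \(\langle \alpha \rangle \), \({\mathbf {\mathsf{{{max}}}}}\,X\) and \({\mathbf {\mathsf{{{min}}}}}\,X\) constructs of Definition 5.

```python
# Illustrative sketch (not the tooling of [25, 26]): formulae are nested
# tuples, e.g. ("box", "a", ("ff",)) encodes [a]ff and ("and", f1, f2)
# encodes conjunction; "max"/"min" carry the bound variable name.

def is_shml(f):
    """Purely syntactic membership check for the safety fragment sHML."""
    tag = f[0]
    if tag in ("tt", "ff", "var"):
        return True
    if tag == "and":
        return is_shml(f[1]) and is_shml(f[2])
    if tag in ("box", "max"):             # [a]theta  and  max X. theta
        return is_shml(f[2])
    return False

def is_chml(f):
    """Purely syntactic membership check for the co-safety fragment cHML."""
    tag = f[0]
    if tag in ("tt", "ff", "var"):
        return True
    if tag == "or":
        return is_chml(f[1]) and is_chml(f[2])
    if tag in ("diam", "min"):            # <a>pi  and  min X. pi
        return is_chml(f[2])
    return False

def is_mhml(f):
    """mHML = cHML ∪ sHML (Definition 5)."""
    return is_shml(f) or is_chml(f)

# [a]ff /\ [b]ff lies in sHML, whereas [a]ff \/ [b]ff mixes box
# modalities with disjunction and falls outside both fragments
assert is_mhml(("and", ("box", "a", ("ff",)), ("box", "b", ("ff",))))
assert not is_mhml(("or", ("box", "a", ("ff",)), ("box", "b", ("ff",))))
```

As Example 11 cautions, such a check is sound but merely syntactic: a formula it rejects may still denote a monitorable property once rewritten into an equivalent \(\mathrm {m\textsc {HML}}\) form.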

The work in [25, 26] goes even further, and shows that the syntactic subset identified in Definition 5 is maximally expressive w.r.t. the monitorability of Definition 4. This means that all the properties that are monitorable can be expressed in terms of the syntactic fragment described in Definition 5. We are unaware of any other maximality results for monitorability in the field of RV.

Example 11

The formula \(\langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \wedge \langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \) is not part of the monitorable syntactic fragment of Definition 5, as the \(\mathrm {c\textsc {HML}}\) syntax prohibits possibility modalities (i.e., \(\langle \alpha \rangle \)) from being combined using conjunctions. However, it turns out that the property denoted by the formula \(\langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \wedge \langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \) is indeed monitorable because it can be monitored for by the monitor \(\alpha .{{\mathbf {\mathsf{{{yes}}}}}}\) (see Definition 3). In fact, \(\langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \wedge \langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \) is semantically equivalent to the formula \(\langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \) which is, in turn, included in the syntactic fragment of Definition 5. More generally, the apparently restrictive \(\mathrm {m\textsc {HML}}\) fragment of Definition 5 still allows us to describe all the monitorable properties expressible in \(\mu \mathrm {\textsc {HML}}\). \(\blacksquare \)

5 The Rocky Error Picture Show

A tenet of the basic RV set-up depicted in Fig. 1 is that the monitor used in the configuration is itself correct. Yet, it is perilous to assume that monitors are immune to errors, for erroneous monitors pose a number of risks. To begin with, they could invalidate the runtime analysis performed, resulting in wasted computational overhead (irrespective of how low this might be). Even worse, erroneous monitors may jeopardise the execution of the SUS itself, whereby a system that originally satisfies a correctness specification ends up violating that same specification once it is instrumented with a monitor. Even though these risks could prove to be as detrimental as that of having high monitoring overheads, few monitors come equipped with correctness assurances. In many cases, it is even unclear what these assurances should be, giving rise to discrepancies between the expected monitor behaviour and the actual monitor implementation.

A formal development such as the monitorability formulation in Sect. 4 may help towards mitigating this. For instance, a synthesis function for the sublogic in Definition 5 was given as a by-product of the monitorability proofs in [25, 26]; this synthesis function was shown to generate monitors that satisfy the correspondence requirements of Definition 3. However, these assurances (which mainly focus on the expressiveness of monitors) may not suffice for certain applications. To illustrate, the monitor \(\alpha .{{\mathbf {\mathsf{{{yes}}}}}} \,\textsf {+}\, \alpha .{{{\mathbf {\mathsf{{end}}}}}}\) adequately monitors for the formula \(\langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \) according to Definition 3. Yet, it is not hard to see that this monitor does not always yield an acceptance verdict when the SUS produces the witness trace \(\alpha \). In this respect, the monitor \(\alpha .\) \({\mathbf {\mathsf{{{yes}}}}}\) from Example 8 may be seen as a better, or even a more correct, implementation of a monitor for \(\langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \) than \(\alpha .{{\mathbf {\mathsf{{{yes}}}}}} \,\textsf {+}\, \alpha .{{{\mathbf {\mathsf{{end}}}}}}\).

The work in [23] studies a possible basis for comparing monitors that is independent of the specification language used. It develops a number of preorders of the form \(m _1 \sqsubseteq m _2\). Intuitively, these denote the fact that, when instrumented with an arbitrary system \(p\), if the monitored system \(m _1 \triangleleft \, p \) exhibits certain characteristics, then \(m _2 \triangleleft \, p \) exhibits them as well. For different monitoring characteristics, one obtains different preorders. Such preorders may be used in a variety of ways. They may be employed as notions of refinement, where \(m _1\) represents a monitor specification whose behaviour is preserved by the concrete implementation \(m _2\). Preorders may also be used to determine when one monitor implementation can be safely substituted with another, without affecting the existing characteristics of a monitoring set-up. Importantly, however, they provide a foundation for understanding monitor errors whenever the relation \(m _1 \sqsubseteq m _2\) does not hold.

We review the salient aspects of this work in the context of our foundational theory for monitors. To simplify the material that follows, we restrict ourselves to monitors that only reach rejections. This allows us to side-step issues related to soundness (discussed earlier in Sect. 4) and focus on orthogonal behavioural aspects. Note that our preference for rejections over acceptances is arbitrary. We begin by capturing the (complete) execution of a monitored system, which may be described in our setting via the notion of a computation, defined below. In Definition 6 the trailing \(\tau \)-transitions permit the monitor to stabilise and reach a verdict after a number of internal computational steps.

Definition 6

The transition sequence with trace \(t\), \( m \triangleleft \, p \!\overset{t}{\Longrightarrow }\! m _0 \triangleleft \, p _0 \xrightarrow {\;\tau \;} m _1 \triangleleft \, p _1 \xrightarrow {\;\tau \;} m _2 \triangleleft \, p _2 \xrightarrow {\;\tau \;} \ldots \), is called a \(t\)-computation if it is maximal (i.e., either it is infinite or it is finite and cannot be extended further using \(\tau \)-transitions). A \(t\)-computation is called rejected (or a rejected computation along \(t\)) iff there exists some \(n\in \mathbb {N}\) in the transition sequence where \(m _n = {{\mathbf {\mathsf{{no}}}}} \). \(\blacksquare \)

Following Definition 3, a criterion for comparing monitors considers the possible verdicts that may be reached after observing a specific execution trace produced by the SUS. In this sense, a monitor is as good as another monitor if it can match all of the rejected computations of the other monitor.

Definition 7

A monitor \(m\) potentially-rejects system \(p\) along trace \(t\), denoted as \(\textsf {pr}(m,p,t)\), iff there exists a rejecting \(t\)-computation from \(m \triangleleft \, p \). Monitor \(m _2\) is as good as \(m _1\) w.r.t. potential rejections, denoted as \(m _1 \sqsubseteq _\textsf {pr} m _2\), iff

$$\begin{aligned} for\,\, all\,\, systems\,\, p,\,\, and\,\, all\,\, traces\,\, t,\,\, \textsf {pr}(m _1,p,t)\,\, implies\,\, \textsf {pr}(m _2,p,t). \end{aligned}$$

The preorder induces the expected kernel equivalence \({m _1 \!\cong _\textsf {pr} m _2 \mathop {=}\limits ^{{\mathrm {def}}}}\) \({(m _1 \sqsubseteq _\textsf {pr} m _2\hbox { and } m _2 \!\sqsubseteq _\textsf {pr} m _1)}\). We write \(m _1 \!\sqsubset _\textsf {pr} m _2\) in lieu of \((m _1 \!\sqsubseteq _\textsf {pr} m _2 \text { and } m _2 \!\not \sqsubseteq _\textsf {pr} m _1)\). \(\blacksquare \)

Example 12

Consider the following monitor descriptions:

$$\begin{aligned} m _1&= \alpha .{\beta .{{{\mathbf {\mathsf{{no}}}}}}} \;\,&m _2&= \alpha .{{{\mathbf {\mathsf{{no}}}}}} \;\,&m _3&= \alpha .{{{\mathbf {\mathsf{{no}}}}}} \,\textsf {+}\, \beta .{{{\mathbf {\mathsf{{no}}}}}} \;\,&m _4&= \alpha .{{{\mathbf {\mathsf{{no}}}}}} \,\textsf {+}\, \beta .{{{\mathbf {\mathsf{{no}}}}}} \,\textsf {+}\, \beta .{{{\mathbf {\mathsf{{end}}}}}} \end{aligned}$$

We have \(m _1\sqsubset _\textsf {pr} m _2\) since, for any \(p\) and trace \(t\), whenever \(m _1\) rejects a \(t\)-computation (in which case \(t\) must have prefix \(\alpha \beta \)), \(m _2\) rejects one as well; the converse does not hold, since for some \(p \!\overset{\alpha }{\longrightarrow }\!p '\) we have \(\textsf {pr}(m _2,p,\alpha )\) but not \(\textsf {pr}(m _1,p,\alpha )\). For similar reasons, we have

$$\begin{aligned}&m _2\;\sqsubset _\textsf {pr} \; m _3\;\cong _\textsf {pr} \; m _4\end{aligned}$$

Observe that \(m _3\) and \(m _4\) are considered to be potential-rejection equivalent. \(\blacksquare \)
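The relationships claimed in this example can be validated mechanically for bounded trace lengths. The sketch below is an illustrative encoding, not the formal instrumentation semantics of Fig. 5: monitors become pairs of an initial state and a nondeterministic transition map, actions \(\alpha \) and \(\beta \) are rendered as `"a"` and `"b"`, and the check `pr_leq` (our name) explores all traces up to a fixed depth against an assumed system able to produce every such trace.

```python
from itertools import product

# Illustrative encoding (not the formal semantics of Fig. 5): a monitor is
# a pair (initial state, transition map), where the map sends (state,
# action) to a set of successor states and "no" is the rejection verdict.

M1 = ("s0", {("s0", "a"): {"s1"}, ("s1", "b"): {"no"}})          # a.b.no
M2 = ("s0", {("s0", "a"): {"no"}})                               # a.no
M3 = ("s0", {("s0", "a"): {"no"}, ("s0", "b"): {"no"}})          # a.no + b.no
M4 = ("s0", {("s0", "a"): {"no"}, ("s0", "b"): {"no", "end"}})   # a.no + b.no + b.end

def can_reject(monitor, trace):
    """Can the monitor reach "no" along (a prefix of) the given trace?"""
    init, trans = monitor
    states = {init}
    for act in trace:
        nxt = set()
        for s in states:
            if s == "no":
                nxt.add("no")            # verdicts are irrevocable
            else:
                nxt |= trans.get((s, act), set())
        states = nxt
    return "no" in states

def pr_leq(ma, mb, actions=("a", "b"), depth=3):
    """Bounded-depth check of ma ⊑pr mb against a system that can
    produce every trace over the given actions."""
    return all(can_reject(mb, t)
               for n in range(depth + 1)
               for t in product(actions, repeat=n)
               if can_reject(ma, t))

assert pr_leq(M1, M2) and not pr_leq(M2, M1)       # m1 ⊏pr m2
assert pr_leq(M2, M3) and not pr_leq(M3, M2)       # m2 ⊏pr m3
assert pr_leq(M3, M4) and pr_leq(M4, M3)           # m3 ≅pr m4
```

Being bounded, such a check can refute an inequality but cannot establish it in general; that requires the compositional techniques of [23] discussed at the end of this section.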

Potential rejections may be too weak a criterion for certain mission-critical applications. For instance, although \(m _4\) can reject traces that start with \(\beta \), there is no guarantee that it will, since it may (non-deterministically) follow the branch \(\beta .\) \({{\mathbf {\mathsf{{end}}}}}\) that does not lead to a rejection. An alternative criterion would thus be to compare monitors w.r.t. their deterministic rejections.

Definition 8

A monitor \(m\) deterministically-rejects system \(p\) along trace \(t\), denoted as \(\textsf {dr}(m,p,t)\), iff all \(t\)-computations from \(m \triangleleft \, p \) are rejecting. Monitor \(m _2\) is as good as \(m _1\) w.r.t. deterministic rejections, denoted as \(m _1 \sqsubseteq _\textsf {dr} m _2\), iff

$$\begin{aligned} for\,\, all\,\, systems\,\, p,\,\, and\,\, all\,\, traces\,\, t,\,\, \textsf {dr}(m _1,p,t)\,\, implies \,\,\textsf {dr}(m _2,p,t). \end{aligned}$$

The respective kernel equivalence, \(m _1 \cong _\textsf {dr} m _2\), and irreflexive preorder, \(m _1 \sqsubset _\textsf {dr} m _2\), induced by deterministic rejections are as expected. \(\blacksquare \)

Example 13

We have the following relationships w.r.t. deterministic rejections for the monitors \(m _1\) to \(m _4\) from Example 12:

$$\begin{aligned}&m _1\;\sqsubset _\textsf {dr} \; m _2\;\cong _\textsf {dr} \; m _4\;\sqsubset _\textsf {dr} \; m _3\end{aligned}$$

We note that, whereas in Example 12, \(m _4\) was considered to be strictly better than \(m _2\) w.r.t. potential rejections, it is considered equivalent w.r.t. deterministic rejections, since the rejections of \(m _4\) for traces commencing with a \(\beta \) action are not deterministic and thus ignored. \(\blacksquare \)
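These orderings can likewise be validated for bounded trace lengths. The sketch below is again an illustrative encoding (monitors as pairs of an initial state and a nondeterministic transition map, with \(\alpha \), \(\beta \) rendered as `"a"`, `"b"`; the helper names are our own): deterministic rejection demands that every resolution of the monitor's nondeterminism reaches \({{\mathbf {\mathsf{{no}}}}}\), and a stuck monitor is treated as terminated and never rejects.

```python
from itertools import product

# Monitors as (initial state, nondeterministic transition map); "no" is
# the rejection verdict. This encoding is illustrative, not the paper's.
M2 = ("s0", {("s0", "a"): {"no"}})                               # a.no
M3 = ("s0", {("s0", "a"): {"no"}, ("s0", "b"): {"no"}})          # a.no + b.no
M4 = ("s0", {("s0", "a"): {"no"}, ("s0", "b"): {"no", "end"}})   # a.no + b.no + b.end

def must_reject(monitor, trace):
    """Do *all* runs of the monitor along the trace reach "no"?"""
    init, trans = monitor
    def go(state, rest):
        if state == "no":
            return True                  # verdicts are irrevocable
        if not rest:
            return False                 # trace consumed without rejecting
        succs = trans.get((state, rest[0]), set())
        # a stuck monitor is terminated (cf. rule Ter) and never rejects
        return bool(succs) and all(go(s, rest[1:]) for s in succs)
    return go(init, trace)

def dr_leq(ma, mb, actions=("a", "b"), depth=3):
    """Bounded-depth check of ma ⊑dr mb against a system that can
    produce every trace over the given actions."""
    return all(must_reject(mb, t)
               for n in range(depth + 1)
               for t in product(actions, repeat=n)
               if must_reject(ma, t))

# m4's b-branch may follow b.end, so its b-rejections are not deterministic
assert not must_reject(M4, ("b",)) and must_reject(M3, ("b",))
assert dr_leq(M2, M4) and dr_leq(M4, M2)           # m2 ≅dr m4
assert dr_leq(M4, M3) and not dr_leq(M3, M4)       # m4 ⊏dr m3
```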

It is worth mentioning that defining the preorders of Definitions 7 and 8 in terms of instrumented system executions (instead of just considering the respective monitor traces in isolation) reveals subtleties that would otherwise be missed. These would nevertheless manifest themselves at runtime.

Example 14

Consider the construct \(\tau .{m}\), describing a monitor that performs an internal computation \(\tau \) before behaving like \(m \). Using the syntax of Fig. 5, this may be encoded as \({{\mathbf {\mathsf{{rec}}}}}\,x.m \), where x is not a free variable in the continuation \(m\) (see rule Rec). We have the following relations between a monitor that immediately rejects (\({{\mathbf {\mathsf{{no}}}}}\)), and another that performs some internal computation before yielding reject (\(\tau .\) \({{\mathbf {\mathsf{{no}}}}}\)):

$$\begin{aligned}&{{\mathbf {\mathsf{{no}}}}} \;\cong _\textsf {pr} \; \tau .{{{\mathbf {\mathsf{{no}}}}}}&\text { but }&\tau .{{{\mathbf {\mathsf{{no}}}}}} \;\sqsubset _\textsf {dr} \; {{\mathbf {\mathsf{{no}}}}} \end{aligned}$$

It is not hard to see why the monitor \({{\mathbf {\mathsf{{no}}}}}\) is a top element for both preorders \(\sqsubseteq _\textsf {pr}\) and \(\sqsubseteq _\textsf {dr}\), since it immediately rejects all traces for any given system \(p\) (i.e., for any \(m\), we have \(m \sqsubseteq _\textsf {pr} {{\mathbf {\mathsf{{no}}}}} \) and \(m \sqsubseteq _\textsf {dr} {{\mathbf {\mathsf{{no}}}}} \)). The monitor \(\tau .\) \({{\mathbf {\mathsf{{no}}}}}\) exhibits the exact same behaviour w.r.t. potential rejections and is thus a top element for \(\sqsubseteq _\textsf {pr}\) as well. Interestingly, however, \(\tau .\) \({{\mathbf {\mathsf{{no}}}}}\) is not a top element for \(\sqsubseteq _\textsf {dr}\). Consider, as a counterexample, the system \(p\) that goes into an infinite (internal) loop \( p \!\overset{\tau }{\longrightarrow }\!p \!\overset{\tau }{\longrightarrow }\!\ldots \). When instrumented with the monitor \(\tau .\) \({{\mathbf {\mathsf{{no}}}}}\), the monitored system can exhibit the \(\epsilon \)-computation \(\tau .{{{\mathbf {\mathsf{{no}}}}}} \triangleleft \, p \!\overset{\tau }{\longrightarrow }\!\tau .{{{\mathbf {\mathsf{{no}}}}}} \triangleleft \, p \!\overset{\tau }{\longrightarrow }\!\tau .{{{\mathbf {\mathsf{{no}}}}}} \triangleleft \, p \!\overset{\tau }{\longrightarrow }\!\ldots \), effectively starving the monitor \(\tau .\) \({{\mathbf {\mathsf{{no}}}}}\) and preventing it from ever reaching its rejection verdict, \({{\mathbf {\mathsf{{no}}}}}\). Thus, we do not have \(\textsf {dr}(\tau .{{{\mathbf {\mathsf{{no}}}}}},p,\epsilon )\) and, as a result, the inequality \({{\mathbf {\mathsf{{no}}}}} \sqsubseteq _\textsf {dr} \tau .{{{\mathbf {\mathsf{{no}}}}}}\) does not hold. \(\blacksquare \)

Formulating Definitions 7 and 8 in terms of instrumented system executions also facilitates the classification of anomalous monitor behaviour.

Example 15

Consider the unresponsive monitor \(m _\omega \), which goes into an infinite loop without exhibiting any other behaviour (i.e., \(m _\omega \!\overset{\tau }{\longrightarrow }\!m _\omega \!\overset{\tau }{\longrightarrow }\!\ldots \)). Whereas this monitor is clearly a bottom element for the potentially-rejecting preorder (i.e., for any \(m\) we have \( m _\omega \sqsubseteq _\textsf {pr} m \)), it is, perhaps surprisingly, not a bottom element for the deterministic-rejection preorder. In fact, according to Definition 8, we obtain seemingly peculiar orderings such as the one below (using monitor \(m _2\) defined earlier in Example 12):

$$\begin{aligned}&m _2\;\sqsubset _\textsf {dr} \; m _\omega \;\sqsubset _\textsf {dr} \; {{\mathbf {\mathsf{{no}}}}} \end{aligned}$$

Note that, according to the instrumentation semantics of Fig. 5, \(m _\omega \) prevents any system that is composed with it from generating observable events: for arbitrary \(p\), whenever \(p \!\overset{\alpha }{\longrightarrow }\!p '\) given some \(\alpha \), rule Mon cannot be applied (since \(m _\omega \not \!\xrightarrow {\;\alpha \;}\)) and neither can rule Ter (since \(m _\omega \xrightarrow {\;\tau \;}\)). This means that \(\textsf {dr}(m _\omega ,p,t)\) holds for any system \(p\) and trace \(t\) that is not empty (i.e., \(t\ne \epsilon \)). This condition holds trivially because no such trace exists, thus explaining why \(m _2\sqsubset _\textsf {dr} m _\omega \). The only case where \(\textsf {dr}(m _\omega ,p,t)\) does not hold is precisely when \(t=\epsilon \), leaving us with \(m _\omega \sqsubset _\textsf {dr} {{\mathbf {\mathsf{{no}}}}} \). \(\blacksquare \)

Using the two preorders \(\sqsubseteq _\textsf {pr} \) and \(\sqsubseteq _\textsf {dr} \), we can define a third (more refined) preorder that is an intersection of the two presented thus far, taking into consideration both of their respective criteria when comparing monitors.

Definition 9

A monitor \(m _2\) is as good as a monitor \(m _1\) w.r.t. rejections, denoted as \(m _1 \sqsubseteq _\textsf {rej} m _2\), iff \(m _1 \sqsubseteq _\textsf {pr} m _2\) and \(m _1 \sqsubseteq _\textsf {dr} m _2\). \(\blacksquare \)

Whenever we have \(m _1 \sqsubseteq _\textsf {rej} m _2\), we can substitute \(m _1\) with \(m _2\) safe in the knowledge that all potential and deterministic rejections made by \(m _1\) are preserved by \(m _2\), irrespective of the system they are composed with. Alternatively, should we consider missed rejections as our notion of error, then whenever \(m _1 \not \sqsubseteq _\textsf {rej} m _2\) we know that, when instrumented with a common system, \(m _2\) may not exhibit all the rejections that \(m _1\) produces.

Example 16

We obtain the following strict ordering w.r.t. Definition 9:

$$\begin{aligned} {}{m _1\;\sqsubset _\textsf {rej} \; m _2\;\sqsubset _\textsf {rej} \; m _4\;\sqsubset _\textsf {rej} \; m _3}\quad \blacksquare \end{aligned}$$

In [23], additional monitor preorders are considered based on other correctness criteria. For instance, monitors are compared w.r.t. their capability of interfering with the execution of a system, as outlined briefly in Example 15. Apart from devising correctness preorders, the aforementioned work also develops sound compositional techniques that can check for preorder inequalities without relying on universal quantifications over systems. Although this is essential for any automated checking of said preorder inequalities, such compositional techniques are outside the scope of this presentation. Interested readers should nevertheless consult [23] for more details.

6 Monitorability and Correctness Now Redux

The concepts outlined in Sects. 4 and 5 served as a basis for various other work related to RV, monitors and monitorability. We here discuss a subset of this work.

Tools. In [6], the monitorability results presented in Sect. 4 (and their respective proofs reported in [26]) were used for the construction of a monitoring tool called detectEr. In this tool, monitorable correctness properties for reactive Erlang programs can be specified using the monitorable syntactic subset of Definition 5. From these logical specifications, detectEr automatically synthesises monitors that are then instrumented to execute alongside an Erlang SUS, reporting satisfaction or violation verdicts depending on the behaviour exhibited. The need for a more comprehensive instrumentation mechanism for the target language (Erlang) resulted in the development of an aspect-oriented utility called eAOP [15]. This tool is incorporated into the detectEr toolchain, and its development and features (e.g. its pointcut definitions) are largely guided by the RV requirements of detectEr. Nevertheless, eAOP can also be used as a standalone utility for other purposes such as Erlang code refactoring.

Fig. 7.
figure 7

Monitoring components C\(_1\) and C\(_2\) and inferring their other plausible traces.

Monitorability. The work of [7] considers a slightly different monitoring set-up to that presented in Fig. 1. Interestingly, it shows how the maximality results for monitorability of Sect. 4 can be extended when working with the new set-up. This work views the SUS as a collection of components (instead of treating it as one monolithic block) whereby certain trace events can be attributed to specific components, as shown in Fig. 7. The monitor is equipped with this static information about the SUS, permitting it to extend its analytical capabilities. In the case of [7], certain components are known to be execution-independent of one another, meaning that, from a particular interleaving observed in the current execution trace, the monitor can infer additional (plausible) traces that can also be used for verification purposes.

For instance, in Example 9 we argued why the property \([\alpha ]{{\mathbf {\mathsf{{ff}}}}} \vee [\beta ]{{\mathbf {\mathsf{{ff}}}}} \) is not monitorable in the basic set-up of Fig. 1. The formula states that a SUS satisfies the property if it either cannot perform \(\alpha \) or it cannot perform \(\beta \). A single trace, however, can only provide enough evidence to deduce that one of the disjuncts is violated, not both. Yet, if we know that actions \(\alpha \) and \(\beta \) can be produced exclusively by the independently-executing components \(C_1\) and \(C_2\) resp., then from a witness trace of the form \(\alpha \beta \ldots \), a monitor in the set-up of Fig. 7 could infer that the trace \(\beta \alpha \ldots \) can also be generated by the system (for a different execution interleaving of components \(C_1\) and \(C_2\)). This allows the monitor to analyse two execution traces instead of one: the witness and inferred traces can then be used together to obtain enough evidence to deduce a violation of the property \([\alpha ]{{\mathbf {\mathsf{{ff}}}}} \vee [\beta ]{{\mathbf {\mathsf{{ff}}}}} \), making it monitorable in this richer set-up.
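This inference step admits a simple mechanisation. The sketch below is an illustrative reconstruction, not the algorithm of [7]: the function `inferred_traces` (our name) closes an observed trace under swaps of adjacent actions that stem from components assumed to be execution-independent.

```python
def inferred_traces(trace, independent):
    """Close the observed trace under single swaps of adjacent actions
    belonging to independent components."""
    traces, todo = {trace}, [trace]
    while todo:
        t = todo.pop()
        for i in range(len(t) - 1):
            if frozenset(t[i:i + 2]) in independent:
                s = t[:i] + (t[i + 1], t[i]) + t[i + 2:]
                if s not in traces:
                    traces.add(s)
                    todo.append(s)
    return traces

# actions "a" and "b" are produced by independently-executing components
ts = inferred_traces(("a", "b"), {frozenset(("a", "b"))})
assert ts == {("a", "b"), ("b", "a")}

# one inferred trace starts with "a" (refuting [a]ff) and another with
# "b" (refuting [b]ff), so both disjuncts of [a]ff \/ [b]ff are violated
assert any(t[0] == "a" for t in ts) and any(t[0] == "b" for t in ts)
```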

In other work [22], the authors show how the monitorable subset of Definition 5 can be used to devise a strategy that apportions the verification process between the pre-deployment (via standard techniques such as model checking) and post-deployment (using RV) phases of software development. To illustrate, consider the property \(\langle \alpha \rangle ([\beta ]{{\mathbf {\mathsf{{ff}}}}} \vee \langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} ) \), which clearly does not belong to the monitorable subset of Definition 5. Reformulating it into its semantic equivalent, \(\langle \alpha \rangle [\beta ]{{\mathbf {\mathsf{{ff}}}}} \vee \langle \alpha \rangle \langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \), allows us to isolate the runtime-monitorable subformula \(\langle \alpha \rangle \langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \). One could then check for the subformula \(\langle \alpha \rangle [\beta ]{{\mathbf {\mathsf{{ff}}}}} \) prior to deployment and, in cases where this is not satisfied, monitor for \(\langle \alpha \rangle \langle \alpha \rangle {{\mathbf {\mathsf{{tt}}}}} \) at runtime.

Monitor Correctness. In [1], the authors consider the use of standard determinisation techniques borrowed from automata theory to determinise monitors and their respective verdicts. The approach would enable a tool to take (non-deterministic) monitor descriptions such as \(\alpha .{\beta .{{\mathbf {\mathsf{{{yes}}}}}}} \,\textsf {+}\, \alpha .{\gamma .{{\mathbf {\mathsf{{{yes}}}}}}}\) (e.g. produced by the detectEr tool of [6] when synthesising monitors from the monitorable formula \(\langle \alpha \rangle \langle \beta \rangle {{\mathbf {\mathsf{{tt}}}}} \vee \langle \alpha \rangle \langle \gamma \rangle {{\mathbf {\mathsf{{tt}}}}} \)) and obtain the monitor \(\alpha .\)(\(\beta .{{\mathbf {\mathsf{{{yes}}}}}} \,\textsf {+}\, \gamma .{{\mathbf {\mathsf{{{yes}}}}}}\)) instead. The key contribution of this work is the establishment of complexity upper bounds (which were more or less known) and, more interestingly, complexity lower bounds showing that converting a non-deterministic monitor of n states to a deterministic one incurs, in the worst case, a blow-up in size of at least \(2^{2^{\varOmega (\sqrt{n \log n})}}\).
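At its core, the determinisation in question can be seen as the classical subset construction from automata theory. The sketch below is illustrative and not the construction of [1]: it merges the monitor states reachable together after each action and, applied to a verdict automaton encoding \(\alpha .{\beta .{{\mathbf {\mathsf{{{yes}}}}}}} \,\textsf {+}\, \alpha .{\gamma .{{\mathbf {\mathsf{{{yes}}}}}}}\) (with actions rendered as `"a"`, `"b"`, `"c"`), it collapses the two \(\alpha \)-branches, mirroring \(\alpha .\)(\(\beta .{{\mathbf {\mathsf{{{yes}}}}}} \,\textsf {+}\, \gamma .{{\mathbf {\mathsf{{{yes}}}}}}\)).

```python
def determinise(init, trans, actions):
    """Classical subset construction over a nondeterministic verdict
    automaton; a state set containing "yes" is accepting."""
    start = frozenset([init])
    states, todo, dtrans = {start}, [start], {}
    while todo:
        cur = todo.pop()
        for a in actions:
            nxt = frozenset(s2 for s in cur for s2 in trans.get((s, a), ()))
            if nxt:
                dtrans[(cur, a)] = nxt
                if nxt not in states:
                    states.add(nxt)
                    todo.append(nxt)
    return start, dtrans

# a.b.yes + a.c.yes: two a-branches that the construction merges
trans = {("s0", "a"): {"s1", "s2"}, ("s1", "b"): {"yes"}, ("s2", "c"): {"yes"}}
start, dtrans = determinise("s0", trans, ("a", "b", "c"))
assert dtrans[(start, "a")] == frozenset({"s1", "s2"})
assert dtrans[(frozenset({"s1", "s2"}), "b")] == frozenset({"yes"})
assert dtrans[(frozenset({"s1", "s2"}), "c")] == frozenset({"yes"})
```

The doubly-exponential lower bound cited above indicates that this merging of state sets cannot, in general, avoid a severe blow-up in monitor size.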

A fully deterministic monitor behaviour (where every event analysed leads exclusively to one possible monitor state) is too rigid in practice to describe the behaviour of certain monitor implementations. A case in point is monitoring that is conducted via a number of concurrent submonitors [27] that may be subject to different thread interleaving every time they are executed. The monitor framework discussed in Sect. 5 served as a basis for the formulation of a definition for consistently-detecting monitors [24] that allows degrees of non-deterministic behaviour as long as these are not externally observable via inconsistent detections. The work in [24] borrows from the compositionality techniques studied in [23] to define an alternative coinductive definition for consistent monitor behaviour based on the notion of controllability [31]. It also develops symbolic techniques that facilitate the construction of analysis tools for determining monitor controllability.

Language Design. The work presented in [14] uses the safety fragment \(\mathrm {s\textsc {HML}}\) from Definition 5 as a basis for defining a scripting language that describes adaptation procedures in a monitor-oriented programming paradigm. It specifically targets systems that are constructed in a layered fashion, with the outer layers adding degrees of functionality in response to the behaviour observed from the inner layers. In this set-up, the monitors produced are not passive entities that merely analyse behaviour; they actively engage with the observed system via adaptations so as to change its behaviour and mitigate system faults. Despite their benefits, runtime adaptations induced by monitors may introduce certain errors themselves. For this reason, in [13] the authors also define language-based methods, in the form of a type system, to assist monitor construction and to statically detect erroneous adaptations declared in the monitor scripts at design time.

7 Conclusion

In this paper we surveyed a body of work that strives to establish a formal foundation for the RV technique. It focusses on addressing two main questions, namely monitorability for system specifications and monitor correctness. Although the account concentrated mainly on unifying the work presented in [23, 25, 26], it also showed how this preliminary work has served as a basis for subsequent developments on the subject, such as [1, 7, 13, 14, 24]. We are currently working on extending these results even further, by investigating monitorability for diverse monitoring set-ups along the lines of [7], and by considering monitors with augmented capabilities such as those that perform property enforcement [36].