1 Introduction

Keeping systems secure against attacks and preventing security incidents are challenging tasks due to the increasing complexity of modern system architectures, where a number of hardware and software components communicate over potentially heterogenous networks. To analyze systems which are too complex to be fully described monolithically, abstraction employing formal methods plays a key role and it has been studied in particular in the computer science literature (see, e.g., Baier and Katoen 2008; Kang et al. 2016). In networked systems, components with different architectures cooperate with each other using various pre-designed protocols. Due to the proliferation of communication using standardized protocols, vulnerabilities or misuses of protocols can result in serious security issues. As a concrete example, Bagheri et al. (2015) introduces a formal model and analysis of a protocol used in Android OS, one of the most popular operating systems for smart phones. In order for components to cooperate with each other without damaging systems and without data corruption, robustness of protocols against communication failures is essential in modern system architectures. To ensure such robustness of protocols, relevant properties, such as safety and liveness, should be satisfied even if packets are dropped for instance. However, the situation is different in the context of malicious attacks, where an attacker that has infiltrated part of the system (e.g., the network) may be able to induce a violation of the safety or liveness properties, thereby causing the protocol to enter an abnormal state.

The development of resilient protocols that satisfy requirements and are applicable to various systems requires formal methods for modelling, verification, and synthesis. These problems have a long history in computer science as well as in control engineering. The readers are referred to Baier and Katoen (2008) and Holzmann and Lieberman (1991) for a comprehensive treatment of modelling and verification by employing formal methods, such as temporal logic. To prevent systems from being damaged by attacks that exploit vulnerabilities of protocols, the recent work (Alur and Tripakis 2017) introduces the process of completing an incompletely specified protocol so that the completed protocol satisfies required properties and does not suffer from deadlock. Alur and Tripakis (2017) explains its methodology of protocol completion using the Alternating Bit Protocol (ABP).

In control engineering, the formalism of of discrete event systems (DES) (Cassandras and Lafortune 2021) and its supervisory control theory (SCT) (Wonham and Cai 2019) are useful tools to treat the problem of protocol verification as a supervisory control problem (Rudie and Wonham 1992), so as to determine whether a given protocol satisfies the required properties. Not only can SCT be used to analyze existing protocols, it can also be used to synthesize a desired protocol based on given requirements. For instance, Kumar et al. (1997) introduces a systematic approach to design a protocol converter for mismatched protocols so that the specifications of the entire system and protocols themselves are satisfied simultaneously. On the other hand, Rudie and Wonham (1990) considers protocols comprising local communicating processes, and formalizes protocol synthesis as the problem of controlling the local processes so that the global specification of the entire system is satisfied, employing the decentralized version of SCT. For a comprehensive survey of protocol synthesis, focusing on the formalization of the design of protocols, the readers are referred to Saleh (1996).

More generally, detection, mitigation, and prevention of attacks on supervisory control systems within the framework of SCT has been considered in several works, such as Carvalho et al. (2018), Wakaiki et al. (2019), Su (2018), and Meira-Góes et al. (2019). Carvalho et al. (2018) presents a methodology of designing intrusion detectors to mitigate online four types of attacks; actuator enablement/disablement and sensor erasure/insertion. Focusing on sensor deception attacks under which the attacker arbitrarily edits sensor readings by intervening between the target system and its control module to trick the supervisor to issue improper control commands, Wakaiki et al. (2019) and Su (2018) study how to synthesize robust supervisors against sensor deception attacks, while (Su 2018) also introduces the synthesis problem of attack strategies from the attacker’s point of view. Subsequently, a different technique from Su (2018) to compute a solution of the synthesis problem of robust supervisors was proposed in Meira-Góes et al. (2019).

As protection against attacks is one of the main subjects of systems security, methodologies for designing attack strategies against systems have been reported in the literature (Meira-Góes et al. 2020; Lin et al. 2019; Von Hippel et al. 2020a). Meira-Góes et al. (2020) presents how to synthesize an attacker in the context of stealthy deception attacks, modelled in the framework of SCT, which cannot be detected by the supervisor and cause damage to the system, as a counter weapon against intrusion detection modules as in Carvalho et al. (2018). While (Meira-Góes et al. 2020) considers sensor deception attacks as the attacker’s weapon, Lin et al. (2019) introduces the synthesis of actuator attacks under which the attacker has the ability to hijack the control commands generated by the supervisor, to damage the system.

Formal synthesis of successful attacks against protocols is the problem considered in this paper, in the context of two case studies. The work in Von Hippel et al. (2020a) (and its conference version (Von Hippel et al. 2020b)) is of special relevance, as it introduces a methodology of attacker synthesis against systems whose components are modelled as finite-state automata (FSA). It presents how so-called “There-exists” attackers can be found (if they exist) using a formal methodology that has been implemented in the software tool Korg (Von Hippel 2020). In the terminology of Von Hippel et al. (2020a), “There-exists” refers to attackers that cannot always lead protocols to a violation of required properties, but sometimes succeed (“there exists” a winning run for the attacker). (Von Hippel et al. 2020a) formulates the properties that protocols must protect against as threat models, and it illustrates its methodology with the Transmission Control Protocol (TCP), specifically connnection establishment using three-way handshake, as standardized in Postel (1981). The formal model in Von Hippel et al. (2020a) was inspired by that in Jero et al. (2015) where automated attack discovery for TCP is performed using a state-machine-informed search.

In this paper, we revisit the respective ABP and TCP models of Alur and Tripakis (2017) and Von Hippel et al. (2020a) in the standard framework of DES modelled as FSA. In contrast to the feedback-loop control system architecture in the previously-mentioned works on sensor/actuator deception attacks in SCT, we consider a network system architecture in which two peers are sending and receiving packets through channels and/or networks, as explained in Section 3. We consider “person-in-the-middle” (PITM) attacks as in Von Hippel et al. (2020a) and Jero et al. (2015), in a manner reminiscent of deception attacks. Inspired by and complementary to the approach in Von Hippel et al. (2020a), we exploit results in SCT and develop a methodology to synthesize “For-all” attackers, that is, attackers that can always eventually cause a violation of required properties of the system, extending the previous work by Von Hippel et al. (2020a) on There-exists attackers. Section 4 will present the details of our methodology, and will state our main results as Theorem 1. We then apply this methodology to both ABP and TCP, using essentially the same models as in Alur and Tripakis (2017) and Von Hippel et al. (2020a). Thus, our results extend those in Von Hippel et al. (2020a) by formally considering the synthesis of “For-all” attackers on TCP, since For-all attacks are more powerful than There-exists attacks. In both of our case studies, we approach attack synthesis as a supervisory control problem under partial observation from the attacker’s viewpoint, which is then solved using existing techniques (Cassandras and Lafortune 2021; Wonham and Cai 2019). As specifically discussed in Section 4.3, under the assumptions of our PITM attack model, a “For-all” attacker for a given threat model is obtained by building the realization of the (partial-observation) nonblocking supervisor that results in the supremal controllable and normal sublanguage (supCN) of the threat model language with respect to the system language and to the attacker’s controllable and observable event sets. The supCN operation was first introduced in Cho and Marcus (1989), and several formulas to compute supCN were derived in Brandt et al. (1990). For each of the two protocols ABP and TCP, respectively in Sections 5 and 6, we analyze several setups capturing different PITM attacker capabilities.

The detailed case studies presented in this paper, based upon established models of ABP and TCP (three-way handshake part), show the various steps on how to build, in a systematic manner, successful PITM attacks (if they exist) on these two well-know protocols. We believe they can also serve as inspiration for similar case studies on other protocols.

The remainder of this paper is organized as follows. Section 2 provides a brief review of the DES framework and its Supervisory Control Theory employed in this paper. In Section 3, we introduce the context on modelling of communication protocols and give an overview of the PITM attack model under consideration, which is based on specifying a safety or nonblockingness property that the attacker is intent on violating in the context of SCT. Section 4 formulates the SCT-based synthesis problem of a For-all attacker (if it exists) and presents the features of the common methodology that is used in the subsequent sections on ABP and TCP, respectively. ABP is considered first in Section 5, and then TCP is considered in Section 6. Both sections contain sufficient details so that these case studies can be replicated. Finally, we conclude the paper in Section 7.

2 Preliminaries

In this section, we introduce several notions of the DES framework in Cassandras and Lafortune (2021), leveraged to build our models in this paper. The central definitions we need here are automata, nonblockingness of automata, parallel composition, supervisory control theory and nonblocking supervisor.

In DES, what happens in the system is explained by sequences of predefined events which discretely occur. Specifically, the system’s behaviour is represented as a set of sequences of events, called a language, and each sequence is called a string. Namely, a language is a set of strings. Note that strings could be arbitrary long and languages could be infinite sets.

One of the intuitive methods to represent (regular) languages is finite state automata (FSA), or simply automata, represented as a quintuple

$$G = (X, E, f, x_{0}, X_{m})$$
(1)

where X is the finite set of states, E is the finite set of events, f : X × EX is the (partial) transition function, x0 is the initial state and \(X_{m} \subseteq X\) is the set of marked states. The function f denotes the system’s behaviour as state transitions defined in the automaton G, e.g., f(x,e) = x represents a transition labelled by event eE from state xX to state \(x^{\prime } \in X\). From Eq. 1, the connection between languages and automata is formally defined as the generated language \({\mathcal{L}}(G) := \{s \in E^{*} \mid f(x_0, s) \text { is defined}\}\).

From the perspective of system control, it makes sense to consider that several behaviours of the system are acceptable or desired. We call the strings denoting acceptable behaviours marked strings, and the language consisting of marked strings is called a marked language. To represent the marked language associated with G, the marked states in Xm come to play that role. Mathematically, the language marked by G is defined by \({\mathcal{L}}_{m}(G) := \{s \in {\mathcal{L}}(G) \mid f(x_{0}, s) \in X_{m}\}\). However, depending on the structure of G, it may not be guaranteed that the system G can always eventually reach its marked states. In particular, the existence of deadlock and livelock in G can prevent the marked states from being reached. Such a property in DES is called nonblockingness. Specifically, G is said to be blocking if \(\overline {{\mathcal{L}}_{m}(G)} \subset {\mathcal{L}}(G)\) and nonblocking if \(\overline {{\mathcal{L}}_{m}(G)} = {\mathcal{L}}(G)\). In other words, if G is blocking, then there exists deadlock or livelock in G, that is, there exists a state from which the marked states cannot be reached, and vice versa.

In many cases, the systems we analyze consist of several subcomponents, or one may want to examine at once the entire behaviour of multiple system models. The DES framework has an operation of automata called parallel composition to build models of entire systems from subsystem models. For example, the parallel composition \(G^{\prime }\) of system G1 and system G2 is denoted by \(G^{\prime } = G_{1} \parallel G_{2}\). Roughly speaking, a common event in G1 and G2 can only occur in \(G^{\prime }\) if both G1 and G2 execute it simultaneously. The private (unshared) events, on the other hand, can be executed in \(G^{\prime }\) whenever feasible in either G1 or G2. For the detailed definition and properties of parallel composition, readers are referred to (Cassandras and Lafortune 2021, pp. 81–87).

Considering that the given systems do not always follow their specifications, supervisory control is a concept to control the systems represented as DES, and its mathematical framework is called supervisory control theory (SCT), which is to synthesize a controller attached to the system so that the given specifications are satisfied. In the framework of SCT in DES, a system to be controlled is called a plant, and a plant is controlled by a supervisor that enables or disables particular (controllable) events so that the plant satisfies a given specification for safety or nonblockingness for instance. The control actions of the supervisor are determined by observation of the strings generated by the plant; thus the plant and supervisor from a feedback loop as depicted in Fig. 1.

Fig. 1
figure 1

The feedback loop of supervisory control

Technically speaking, a supervisor S is defined as a function

$$S: \mathcal{L}(G) \to 2^{E}$$
(2)

which takes a string generated by G and returns a set of events permitted to occur in G. In other words, S(s) is a control action for a string \(s \in {\mathcal{L}}(G)\). Note that supervisor S is prohibited from disabling a feasible uncontrollable event at any state. Namely, letting \(E_{uc} \subseteq E\) be a set of uncontrollable events in G, for each \(s \in {\mathcal{L}}(G)\), it always holds that \(E_{uc} \cap \{e \in E \mid f(f(x_0, s), e) \text { is defined}\} \subseteq S(s)\).

In the framework of SCT, it is also considered that the supervisor has a limited observability of events generated by the plant. This limitation is represented by partitioning the set of events E into two disjoint subsets: the sets of observable events Eo and of unobservable events Euo, namely E = EoEuo. To implement this property, the supervisor in Eq. 2 is extended to the partial-observation supervisorSP defined by

$$S_{P}: P[\mathcal{L}(G)] \to 2^{E}$$
(3)

where P is the natural projection from domain E to codomain \(E_{o}^{*}\), removing unobservable events from a string generated by G. Note that in this scheme, the control action by SP is supposed to always take effect before any unobservable event occurs.

Given G and SP, the closed-loop behaviour of G controlled by SP is denoted by a DES SP/G, formalized in the following definition.

Definition 1 (Languages generated and marked by S P/G)

(cf. (Cassandras and Lafortune 2021, p. 151)) The generated language \({\mathcal{L}}(S_{P}/G)\) is recursively defined as

  1. 1.

    \(\varepsilon \in {\mathcal{L}}(S_{P}/G)\)

  2. 2.

    \([s \in {\mathcal{L}}(S_{P}/G) \land s\sigma \in {\mathcal{L}}(G) \land \sigma \in S_{P}[P(s)]] \iff [s\sigma \in {\mathcal{L}}(S_{P}/G)]\)

and the marked language \({\mathcal{L}}_{m}(S_{P}/G)\) is defined as

$$\mathcal{L}_{m}(S_{P}/G) := \mathcal{L}(S_{P}/G) \cap \mathcal{L}_{m}(G).$$
(4)

We can also examine the blockingness of SP/G as a meaningful characteristic of the controlled system. Similarly to the blockingness of G, the DES SP/G is said to be blocking if \({\mathcal{L}}(S_{P}/G) \neq \overline {{\mathcal{L}}_{m}(S_{P}/G)}\) and blocking if \({\mathcal{L}}(S_{P}/G) = \overline {{\mathcal{L}}_{m}(S_{P}/G)}\). Since these properties depend on the synthesis result of SP, SP is said to be blocking if SP/G is blocking and to be nonblocking if SP/G is nonblocking.

The specification that the plant should obey is given as a specification language \(L^{spec} \subseteq {\mathcal{L}}(G)\), or its automaton representation H such that \({\mathcal{L}}_{m}(H) = L^{spec}\). It is an important point that Lspec may not be \({\mathcal{L}}_{m}(G)\)-closed, namely \(L^{spec} \neq \overline {L^{spec}} \cap {\mathcal{L}}_{m}(G)\), and we may want the supervisor SP to “mark” strings in \({\mathcal{L}}(S_{P}/G)\) based on Lspec, rather than \({\mathcal{L}}_{m}(G)\). Therefore, the SCT framework provides an alternative version of SP, called a marking supervisor, defined as

$$\mathcal{L}_{m}(S_{P}/G) := \mathcal{L}(S_{P}/G) \cap L^{spec}.$$
(5)

For the technical details of marking supervisors, the readers are referred to Section 3.9 in Cassandras and Lafortune (2021). In the rest of this paper, nonblockingness of SP will be defined by either Eqs. 4 or 5, depending on the properties of the considered specification Lspec (namely, Lspec being \({\mathcal{L}}_{m}(G)\)-closed or not).

3 System and attack models

Before proceeding to the specific ABP and TCP protocols, we highlight in this section and in the next one the common elements of our two case studies.

3.1 System architecture

When modelling communication protocols such as ABP and TCP, we consider a “system” that consists of peers communicating with each other, channels, and networks. For clarity of presentation, we suppose the system comprises two peers, two or four channels, and one network. If peers form a small network using channels, e.g., a local area network (LAN), then networks can be omitted and we consider two channels connecting each peer, namely, the forward and backward channels.

Fig. 2
figure 2

Communication overview

Figure 2 illustrates an overview of the flow of packets between two peers through channels. Peers A and B exchange packets using communication protocols through the channels and network. In this paper, we consider “person-in-the-middle” (PITM) as the attack model on the system. In this model, the attacker infiltrates the network or channels, and afterwards sends fake packets and/or discards genuine ones, exploiting vulnerabilities of the protocol (as captured by the peer automata), to damage the system. The system may contain other processes for exogenous events, e.g., timers, called environment processes, which are not depicted in Fig. 2. Channels work as interfaces between the peers and the network, relaying packets to their destinations. Each component of the system is modelled by a finite-state automaton, and denoted as follows:

GPA: Peer A; GPB: Peer B; GC: Channel; GN: Network; and Ge: Environment processes.

Each channel is represented by one finite-state automaton, thus GC is the parallel composition of the channel automata. For example, if the system architecture is that in Fig. 2a, then GC = GC1GC2GC3GC4 where GCi (i = [1,4]) are the respective automata modelling each channel. If the system architecture is that in Fig. 2b, then GC = GFCGBC where GFC and GBC are the forward and backward channels, respectively, and GN is empty since there is no network in such an architecture. In the case where there exist more than two environment processes in the system, Ge is also constructed as the parallel composition of all environment processes.

To capture PITM attacks on the above system, we create new versions of the channels and network automata when they are infiltrated by the attacker and denote them by GC,a and GN,a, respectively. We consider that the attacker cannot directly tamper the internal codes of peers in our model of PITM attacks, meaning that the attacker cannot disable nor enable the private events of the peers. Instead, in the infiltrated channels or network, the attacker intercepts packets and can delete them, and can also insert new packets to impersonate the sender or receiver, as similarly considered in Jero et al. (2015). Thus, we construct GC,a and GN,a by the addition of new transitions and events that represent the feasible actions of the attacker, as the addition can capture insertion and replacement of packets, and packet deletion by the attacker can be captured by disabling transitions which indicate packet transfer. Concrete examples of GC,a and GN,a will be presented in the case studies in Sections 5 and 6.

Let us define a nominal system model (i.e., without attacker) by

$$G_{nom} := (X_{nom}, E_{nom}, f_{nom}, x_{nom,0}, X_{nom,m}).$$
(6)

Gnom is the parallel composition of the peers, channels, network, and environment processes, namely

$$G_{nom} = G_{PA} \parallel G_{PB} \parallel G_{C} \parallel G_{N} \parallel G_{e}$$
(7)

As we consider PITM attacks on the system, we enhance Gnom to the new model of the system under attack

$$G_{a} := (X_{a}, E_{a}, f_{a}, x_{a,0}, X_{a,m})$$
(8)

where possible new transitions and events representing the actions of the attacker come from the enhanced GC,a and GN,a automata described above. The other compoents of Gnom, namely the peer automata GPA and GPB, as well as Ge, remain unchanged. In our case studies, the plant Ga is acted upon by the attacker; hence, the plant consists of the entire system under attack:

$$G_{a} = G_{PA} \parallel G_{PB} \parallel G_{C,a} \parallel G_{N,a} \parallel G_{e}$$

The sending and receiving of packets are represented by events. As we consider PITM attacks, it is reasonable to assume that an attacker infiltrating the network or channels can only monitor incoming and outgoing packets at the infiltrated component. In other words, the attacker cannot observe the private events of the peers. Therefore, we consider that the events in our system model are partitioned into observable events and unobservable events, based on the system structure and the capability of the attacker. It is also natural to assume that the attacker cannot prevent the peers from sending packets to the network or channels, although the attacker can discard their packets. That is, the attacker cannot control the receiving of packets by the network or channels.

Example 1

Let us consider PITM attacks on the Alternating Bit Protocol (ABP). ABP is a protocol which defines the communication mechanism between two peers depicted in Fig. 2b. Each peer sends and receives packets from its counterpart through the forward and backward channels using first-in-first-out (FIFO) semantics. Inspired by (Alur and Tripakis 2017), we consider Gnom as the parallel composition of the following 7 automata.

  • GS = (XSESfSxS,0XS,m): ABP sender

  • GR = (XRERfRxR,0XR,m): ABP receiver

  • GFC = (XFCEFCfFCxFC,0XFC,m): Forward channel

  • GBC = (XBCEBCfBCxBC,0XBC,m): Backward channel

  • GSC = (XSCESCfSCxSC,0XSC,m): Sending client

  • GRC = (XRCERCfRCxRC,0XRC,m): Receiving client

  • GT = (XTETfTxT,0XT,m): Timer

Therefore, we have

$$G_{nom} = G_{S} \parallel G_{R} \parallel G_{FC} \parallel G_{BC} \parallel G_{SC} \parallel G_{RC} \parallel G_{T}$$
(9)

We also consider that Peer A first sends packets to Peer B, and afterwards Peer B sends an acknowledgement to Peer A. Since Peer A plays a role of the sender side and Peer B is at the receiver side, GPA = GS, GPB = GR, GC = GFCGBC, and Ge = GSCGRCGT, thus Eq. 9 reduces to Eq. 7. Note that GN in Eq. 7 will be empty in this case.

The various event sets are defined as follows, where synchronization in || will be achieved by common events:

$$\begin{array}{@{}rcl@{}} E_{S} & =& \{send, done, timeout, p_{0}, p_{1}, a_{0}^{\prime}, a_{1}^{\prime}\} \end{array}$$
(10)
$$\begin{array}{@{}rcl@{}} E_{R} & =& \{deliver, p_{0}^{\prime}, p_{1}^{\prime}, a_{0}, a_{1}\} \end{array}$$
(11)
$$\begin{array}{@{}rcl@{}} E_{FC} & =& \{p_{0}, p_{1}, p_{0}^{\prime}, p_{1}^{\prime}\} \end{array}$$
(12)
$$\begin{array}{@{}rcl@{}} E_{BC} & =& \{a_{0}, a_{1}, a_{0}^{\prime}, a_{1}^{\prime}\} \end{array}$$
(13)
$$\begin{array}{@{}rcl@{}} E_{SC} & =& \{send, done\} \end{array}$$
(14)
$$\begin{array}{@{}rcl@{}} E_{RC} & =& \{deliver\} \end{array}$$
(15)
$$\begin{array}{@{}rcl@{}} E_{T} & =& \{timeout\} \end{array}$$
(16)

Hence

$$\begin{array}{@{}rcl@{}} E_{nom} & =& E_{S} \cup E_{R} \cup E_{FC} \cup E_{BC} \cup E_{SC} \cup E_{RC} \cup E_{T} \end{array}$$
(17)
$$\begin{array}{@{}rcl@{}} & =& \{send, done, timeout, deliver, p_{0}, p_{1}, p_{0}^{\prime}, p_{1}^{\prime}, a_{0}, a_{1}, a_{0}^{\prime}, a_{1}^{\prime}\}. \end{array}$$
(18)

The events with prefix “p” indicate that a packet with indicator bit “0” or “1” has been sent from the ABP sender to the ABP receiver (i.e., from Peer A to Peer B), and prefix “a” indicates an acknowledgement sent from the ABP receiver to the ABP sender, corresponding to which “0” or “1” has been received by the ABP receiver. The prime symbol is attached to the events of packets and acknowledgement to distinguish those before going through the channel from the corresponding ones after the channels, as is done in Alur and Tripakis (2017).

Figure 4 shows the models of the ABP components. GS and GR are example solutions of the distributed protocol completion problem in Alur and Tripakis (2017). Note that we have removed from the models in Fig. 4 “dead” transitions which are never executed by the system when the attacker is not present. The terminology “dead” is from (Alur and Tripakis 2017). In addition, we mark all the states of the ABP components, for reasons that will become clear later. Namely,

$$X_{S,m} = X_{S},\ X_{R,m} = X_{R},\ X_{FC,m} = X_{FC}, \ X_{BC,m} = X_{BC},\ X_{SC,m} = X_{SC},\ X_{RC,m} = X_{RC}, \ X_{T,m} = X_{T}$$

In (Alur and Tripakis 2017), the forward and backward channels are modelled as nondeterministic finite-state automata as shown in Fig. 4b and d. That nondeterminism is introduced to model nonadversarial errors in communication channels, such as packet drop and duplication (see Section 4.2 in Alur and Tripakis (2017)). To construct the system model in Eq. 9, we need deterministic finite-state automata as factors of the parallel composition. Thus, we construct GFC and GBC as observer automata of \(G_{FC}^{nd}\) and \(G_{BC}^{nd}\), depicted in Fig. 5:

$$\begin{array}{@{}rcl@{}} G_{FC} & =& Obs(G_{FC}^{nd}) \end{array}$$
(19)
$$\begin{array}{@{}rcl@{}} G_{BC} & =& Obs(G_{BC}^{nd}) \end{array}$$
(20)

where “observers” are as defined in Cassandras and Lafortune (2021) and capture the standard conversion of a nondeterministic automaton to a deterministic one (often referred to as subset construction). Observe that GFC and GBC generate exactly the same languages as \(G_{FC}^{nd}\) and \(G_{BC}^{nd}\), respectively.

Let us consider one example case of PITM attacks where a powerful attacker infiltrates the forward channel. To construct the plant under attack Ga capturing the attacker’s actions, we enhance \(G_{FC}^{nd}\) to \(G_{FC,a}^{nd}\) as depicted in Fig. 8a by adding the new transitions shown as the red arrows. This enhanced channel model represents the attacker’s capability that can send packets to the recipient with whichever bit 0 or 1, regardless of the incoming packets from the sender. Letting \(G_{FC,a} = Obs(G_{FC,a}^{nd})\) in the same way as Eq. 19, Ga is hereby given by

$$G_{a} = G_{S} \parallel G_{R} \parallel G_{FC,a} \parallel G_{BC} \parallel G_{e}$$
(21)

Section 5 describes in detail the procedure to model the PITM attack against ABP.

In our case studies, Ga is the plant and the attacker plays a role of the supervisor; in this context, the specification represents what damage the attacker wants to cause to the system. In other words, the specification should capture violations of a desired property of the communication protocol, such as absence of deadlock or proper delivery of packets. Therefore, using SCT to synthesize a supervisor that enforces the violation of a desired property of the communication protocol under consideration means that we have actually synthesized an attack strategy that indeed causes a violation of that property.

3.2 For-all attack

One of the contributions of this paper as compared to previous work is that we consider that the attacker wants to attack the system in a “For-all” manner, to be interpreted in the following sense: the attacker can always eventually cause a violation of the given property. Such specifications are naturally captured in SCT using the notion of marked states and nonblockingness. When the marked states capture the violation of the given property, then a nonblocking supervisory in SCT will exactly achieve the goal of a For-all attacker, since it will always be possible to eventually reach a marked state. Specifically, consider an attacker’s marked (i.e., non-prefix-closed) specification language \(L_{a}^{spec} \subset {\mathcal{L}}(G_{a})\) which consists of strings that are illegal but feasible in the system under attack. Let Sa be a supervisor (aka attacker) for Ga that achieves as much of \(L_{a}^{spec}\) as possible in the controlled system Sa/Ga. We denote this marked language by K, namely, \(K \subseteq L_{a}^{spec}\) and the attacker wants K to be as large as possible. In order to achieve a For-all attack, the attacker wants Sa to be nonblocking, namely, \({\mathcal{L}}_{m}(S_{a}/G_{a}) = K\) and \({\mathcal{L}}(S_{a}/G_{a}) = \overline {K}\). Thus, nonblockingness of the system under attack implies that the attacker can always eventually win; thus, we have indeed obtained a For-all attack strategy. This is how For-all attacks are defined in this paper.

The above definition of For-all attacks is formalized in Definition 2.

Definition 2 (For-all Attack-Supervisor)

Given \(L_{a}^{spec} \subset {\mathcal{L}}(G)\), let \(K \subseteq L_{a}^{spec}\) be a nonempty sublanguage. Sa is said to be a For-all attack-supervisor with respect to Ga and K if

  1. 1.

    \({\mathcal{L}}_{m}(S_{a}/G_{a}) = K\); and

  2. 2.

    \({\mathcal{L}}(S_{a}/G_{a}) = \overline {K}\).

3.3 There-exists attack

If there exists a supervisor Sa not satisfying the condition in Definition 2 but \({\mathcal{L}}(S_{a}/G_{a}) \cap K \neq \varnothing\), then we say that such an Sa achieves a There-exists attack, because in that case the controlled system (under the actions of the attacker) Sa/Ga will contain deadlocks and/or livelocks (i.e., the system under attack is blocking in the terminology of SCT); this prohibits the attacker from always being able to eventually win. Still, the nonemptyness of \({\mathcal{L}}_{m}(S_{a}/G_{a})\) means that the attacker can sometimes win. This is how There-exists attacks are defined in this paper.

The above definition of There-exists attacks is formalized in Definition 3.

Definition 3 (There-exists Attack-Supervisor)

Given \(L_{a}^{spec} \subset {\mathcal{L}}(G)\), let \(K \subseteq L_{a}^{spec}\) be a nonempty sublanguage. Sa is said to be a There-exists attack-supervisor with respect to Ga and K if

  1. 1.

    \({\mathcal{L}}(S_{a}/G_{a}) \cap K \neq \varnothing\); and

  2. 2.

    Sa is not a For-all attack-supervisor.

Now that we have shown how to build the plant model Ga, we address in the next section the construction of an automaton representation for the (non-prefix-closed) language \(L_{a}^{spec}\), which will be the “specification automaton” for the attacker that is needed in the context of SCT algorithmic procedures.

Remark 1

In the prior work (Von Hippel et al. 2020a), mostly analogous definitions of There-exists and For-all attackers are given, but in the framework of reactive synthesis with infinite strings and temporal logic (LTL) specifications (see Definition 6 in Von Hippel et al. (2020a)). The technical difference comes from requiring “can always eventually win” instead of requiring “will always eventually win” (as is typically done in LTL and is done in Von Hippel et al. (2020a)). The latter is expressible in LTL, but not the former. The reactive synthesis setting is formally compared to that of SCT in Ehlers et al. (2017), where it is shown that nonblockingness in SCT is not expressible in LTL but instead corresponds to “AGEF(marked)” in CTL. In this paper, since we use SCT, we match the notion of “AGEF(marked)”, i.e., “can always eventually win”. Moreover, since we are working in the context of SCT, we will use the term “nonblockingness” for the class of “liveness” properties that will be considered in this paper.

4 Procedure for synthesis of For-all attacks on communication protocols

In this section, we discuss the modelling procedure to construct a specification automaton for the attacker based on the considered properties (instances of safety or nonblockingness) of the communication protocol that are to be violated by actions of the attacker. Then, we formulate the problem of finding For-all feasible attacks on the system as a supervisor synthesis problem in SCT which has a known solution. The SCT-based methodology presented in this section will be applied to ABP and TCP in the next two sections.

4.1 Safety properties

As in Alur and Tripakis (2017), consider a safety property whose violation is modelled by an automaton, termed a safety monitor Gsm. Gsm captures the violation of the given safety property in terms of illegal states in its structure. Since the specification for attackers represents a violation of the property, the illegal states are represented by marked states in Gsm. In other words, Gsm captures the violation of the safety property of interest when it reaches its marked states. (Note that in our problem context, we do require marked states to capture violation of safety properties.)

Gsm can be derived from automata composing Gnom, namely, the peers, channels, or network, by modifying state marking for instance. One can also independently design Gsm as a new automaton that we call a dedicated automaton in this paper. Both instances will occur in our case studies. For example, in Section 5, the safety monitors Gsm for ABP are given as dedicated automata in Fig. 6. Let Gother be the parallel composition of the automata in Gnom which are not used to construct Gsm. For example, from Eq. 7, if Gsm is built by modifying GPAGPB, then Gother = GCGNGe. In Section 6, we will construct Gsm for the TCP case study using TCP peer models GPA and GPB in Fig. 16 later on.

The specification automaton will in our case studies be the parallel composition of Gother and Gsm, as is commonly done in SCT. Letting Hnom be the specification automaton with respect to Gnom (system without attacker), we have that Hnom = GotherGsm. Note that since we want marking in Hnom to be determined by marking in Gsm, all the states of Gother are to be marked. In the absence of attackers, the communication protocol should ensure the safety property under consideration, which means that its violation should never occur. This can be verified by confirming that Hnom has no reachable marked states, i.e., Hnom captures no violations of the given safety property with respect to Gnom.

To represent the specification automaton with respect to the system under attack, namely Ga, we construct Gother,a based on Ga in the same manner as Gother. For instance, if Gsm is a dedicated automaton and the attacker infiltrates the network, then Gother,a = GPAGPBGCGN,aGe. Let Ha = Gother,aGsm be the specification automaton under attack. Similarly to marking in Gother, we want Gsm to determine marking in Ha, thus all the states of Gother,a are to be marked. If there exist no marked states in Ha, then the attacker is not powerful enough to cause a violation of the safety property. Even if Ha has marked states, there may not exist For-all attacks (but possibly only There-exists attacks), depending on whether a nonblocking supervisor can be synthesized with respect to plant Ga and specification automaton Ha; this will be addressed in the solution of the SCT problem discussed below.

In summary, the procedure to build Ha for a given safety property is presented in Algorithm 1.

Algorithm 1
figure c

Attack Specification against Safety (SafeSpec).

Proposition 1

Suppose that \(Y_{nom,m} = \varnothing\) in Algorithm 1, that is, the given system model is correct in terms of the safety properties. If Ya,m on line 16 is empty, then no For-all attack exists and no There-exists attack exists.

Proof

By construction, Gsm captures a violation of the given safety property by reaching its marked states. Let Xother,a and Xsm be the sets of states of Gother,a and Gsm, respectively. Note that \(Y_{a} \subseteq X_{other,a} \times X_{sm}\) from line 15 of Algorithm 1. Since all the states in Xother,a are marked, it holds that \(Y_{a,m} = \varnothing\) iff for every (xother,axsm) ∈ Ya, xsm is not marked. This means that the safety monitor Gsm never captures the violation iff Ha has no marked states. In other words, the attacker can never cause a violation of the given safety property. Therefore, if \(Y_{a,m} = \varnothing\), then no For-all attack exists and no There-exists attack exists.

We build several instances of Ha for ABP in Section 5.4 and for TCP in Section 6.5. The safety monitors for ABP are given as dedicated automata in Fig. 6, as will be explained in Section 5.1, while those for TCP are derived from Ga based on the given safety property, as will be explained in Section 6.2.

4.2 Nonblockingness properties

We examine a “limited” liveness property, called nonblockingness, as expressible in SCT for ∗-languages, namely, languages of finite strings. Nonblockingness is an adequate tool in many applications, such as in software systems; see, e.g.: deadlock in database concurrency control (Lafortune 1988); deadlock in multithreaded programs (Gadara project) (Liao et al. 2013). Since our approach is based on SCT, nonblockingness is the only type of liveness property that we consider in our case studies. Thus, the set of marked states used for nonblockingness will be the “parameter” that captures the desired instance of liveness. In our setting, in For-all attacks the attacker wants to cause a violation of nonblockingness with respect to the chosen marked states. First of all, Gnom in Eq. 7 should be trim for correctness of the system without attacker, as otherwise Gnom would contain deadlocks or livelocks. However, Ga should not be trim, meaning that the system under attack should contain deadlock or livelock states, i.e., be blocking.

As for the case of safety monitors previously considered, in several instances the violation of the nonblockingness property of interest will be modelled using a dedicated automaton, the nonblockingness monitor Gnm; one such example is shown in Fig. 7, inspired by Alur and Tripakis (2017) and considered in in Section 5.2. The marked states of Gnm will record the violation of the given nonblockingness property.

On the other hand, if Gnm is not given a priori, then violations of nonblockingness will be captured as follows: starting from Ga, unmark all states and mark instead the desired (from the viewpoint of the attacker) deadlock and livelock states in Ga, resulting in a suitable Gnm model. This is done because deadlock and livelock states are illegal, and the attacker wants the system to reach those illegal states (some or all of them, depending on the type of attack). This is the approach that we will follow in our case study on TCP, as will be explained in Section 6.5.3 and 6.5.4.

Next, we construct Gother,a in the same way as in Section 4.1. That is, we model Gother,a as the parallel composition of the automata in Ga which are not used to build Gnm, and ensure that all the states in Gother,a are marked. Note that if Gnm is not given as a dedicated automaton and we derive Gnm from Ga, then Gother,a is empty.

Finally, we define Ha = Trim(Gother,aGnm), to represent the specification for the attacker which leads the plant to deadlock or livelock states. As a result, we introduce the algorithm to construct Ha in the case of the nonblockingness properties in Algorithm 2.

Algorithm 2
figure d

Attack Specification against Nonblockingness (NonblockSpec).

Proposition 2

Suppose that Gnom is trim in Algorithm 2, that is, the given system model is correct in terms of the nonblockingness properties. If Ya,m on line 19 is empty, then no For-all attack exists and no There-exists attack exists.

Proof

The proof can be done in the same manner as of Proposition 1, replacing Gsm by Gnm.

We will discuss several instances of Ha for ABP in Section 5.4 and TCP in Section 6.5.

4.3 Problem formulation

In this section, we formulate the Attack-Supervisor Synthesis Problem (ASSP), which is an instance of a standard SCT partial-observation supervisory control problem, but where the attacker plays the role of “supervisor” and the specification is a violation of a given communication protocol property. ASSP is the formal statement of the For-all attack synthesis problem that is solved in our case studies on ABP and TCP.

Attacked-plant

As was described earlier, GC and/or GN are modified to represent the attacker’s ability of inserting and/or discarding packets, resulting in new automata denoted by GC,a and GN,a. Next, we form the plant Ga for ASSP as the parallel composition of nominal and infiltrated automata. For example, if the network is infiltrated by the attacker, then Ga = GPAGPBGCGN,aGe.

Attack specification

Next, we construct Ha using Algorithm 1 or Algorithm 2 based on the given safety or nonblockingness property to be violated, as discussed in Section 4.1 and Section 4.2. Since marking of states in Ha is determined by marking in Gsm or Gnm, the language marked by Ha, \({\mathcal{L}}_{m}(H_{a})\), represents strings where the attacker wins, because

  1. (i)

    These strings are feasible in Ga by construction.

  2. (ii)

    These strings lead the safety or nonblockingness monitor to a marked state.

As we discussed in Section 3.1, it is reasonable to assume that in PITM attacks the attacker cannot disable or enable the events in the nominal (non-infiltrated) automata, and also that the attacker only observes the events in the automata of the infiltrated components. Thus we define the two partitions of Ea in Eq. 8, from the viewpoint of the attacker (which plays the role of supervisor):

  1. (i)

    Controllable events Ea,c and uncontrollable events Ea,uc for controllability.

  2. (ii)

    Obsevable events Ea,o and unobservable events Ea,uo for observability.

Consequently, we have the following supervisory control problem, under partial observation, for the attacker.

Problem 1 (Attack-Supervisor Synthesis Problem, or ASSP)

Let Ga be a plant automaton, under attack, as in Eq. 8; Ea,c be a set of controllable events; Ea,o be a set of observable events; and \({\mathcal{L}}_{m}(H_{a}) \subset {\mathcal{L}}(G_{a})\) be a marked (non-prefix-closed) specification language. Find a maximal controllable and observable sublanguage of \({\mathcal{L}}_{m}(H_{a})\) with respect to \({\mathcal{L}}(G_{a})\), Ea,c, and Ea,o, if a non-empty one exists.

The following theorem states that a non-empty output of ASSP will be the controlled behaviour under a successful For-all attack, highlighting our main results in this paper.

Theorem 1

Let K be a solution of ASSP. Then there exists a For-all attack-supervisor with respect to Ga and K. Conversely, if ASSP has no non-empty solution, then there does not exist a For-all attack-supervisor for \(L^{spec}_{a} = {\mathcal{L}}_{m}(H_{a})\), with the given controllable and observable event sets for the attacker.

Proof

Since K is a controllable and observable sublanguage of \({\mathcal{L}}_{m}(H_{a}) \subset {\mathcal{L}}(G_{a})\), from the “controllability and observability theorem” (Cassandras and Lafortune 2021, p. 197), there exists a supervisor SP such that \({\mathcal{L}}_{m}(S_{P}/G_{a}) = K\) and \({\mathcal{L}}(S_{P}/G_{a}) = \overline {K}\). From Definition 2, SP here is a For-all attack-supervisor with respect to Ga and K. If \({\mathcal{L}}_{m}(H_{a})\) is not \({\mathcal{L}}_{m}(G)\)-closed, we consider SP to be a marking supervisor, as mentioned in Section 2. Conversely, if the empty set is the only solution to ASSP, then there is no For-all attacker: this is because there is no non-empty language satisfying conditions 1 and 2 in Definition 2.

The realization (using standard SCT terminology) of the corresponding (nonblocking) supervisor will encode the control actions of the attacker. By taking the parallel composition of the supervisor’s realization with the plant, we obtain an automaton that is language equivalent (generated and marked) to the plant under supervision. Namely, letting Ra be the realization of SP, it holds that RaGa is language equivalent to the controlled plant SP/Ga; see (Cassandras and Lafortune 2021; Wonham and Cai 2019). Ra therefore corresponds to a TM-attacker as defined in Von Hippel et al. (2020a). In ASSP, we require maximalty of the controllable and observable sublanguage, since this problem is known to be solvable (Yin and Lafortune 2015).

In the PITM attack model, the assumption of \(E_{a,c} \subseteq E_{a,o}\) usually holds. In fact, in all of the scenarios considered in Sections 5 and 6, the condition \(E_{a,c} \subseteq E_{a,o}\) will hold. In this important special case, the supremal controllable and observable sublanguage of \({\mathcal{L}}_{m}(H_{a})\) with respect to \({\mathcal{L}}(G_{a})\), Ea,c, and Ea,o exists and is equal to the supremal controllable and normal sublanguage of \({\mathcal{L}}_{m}(H_{a})\), denoted by \({{\mathcal{L}}_{m}(H_{a})^{\uparrow CN}}\), with respect to \({\mathcal{L}}(G_{a})\), Ea,c, and Ea,o. If it is empty, then no For-all attack exists for the given safety or nonblockingness property.

If \({{\mathcal{L}}_{m}(H_{a})^{\uparrow CN}} \neq \varnothing\), then this language represents the largest attacked behaviour which is possible in the context of a For-all attack against the safety or nonblockingness property. Any marked string in that language provides an example of a successful attack, which is feasible in Ga and steers Gnm or Gsm to its marked (illegal) state. Let \(H_{a}^{CN}\) be the trim automaton output by the algorithm for the supremal controllable and normal sublanguage, namely

$$\mathcal{L}_{m}(H_{a}^{CN}) = {\mathcal{L}_{m}(H_{a})^{\uparrow CN}}$$
(22)

and

$$\mathcal{L}(H_{a}^{CN}) = \overline{{\mathcal{L}_{m}(H_{a})^{\uparrow CN}}}$$
(23)

From the controllability and observability theorem of SCT, there exists a partial-observation nonblocking supervisor SP such that

$$\mathcal{L}(S_{P}/G_{a}) = \overline{{\mathcal{L}_{m}(H_{a})^{\uparrow CN}}} = \mathcal{L}(H_{a}^{CN})$$
(24)

SP corresponds to a For-all attack-supervisor since every string in the controlled behaviour, SP/Ga, can be extended to a marked string, by nonblockingness of SP. In other words, it is always eventually possible for the system under attack by SP to violate the given property.

In the above formulation, \({\mathcal{L}}_{m}(H_{a})\) may not be \({\mathcal{L}}_{m}(G_{a})\)-closed, since it is possible that Ga = Gother,a and all the states in Ga are marked. Therefore, according to the use of Gsm and Gnm, whenever necessary we define SP as a marking supervisor by following (5), namely

$$\mathcal{L}_{m}(S_{P}/G_{a}) := \mathcal{L}(S_{P}/G_{a}) \cap \mathcal{L}_{m}(H_{a}^{CN}) = {\mathcal{L}_{m}(H_{a})^{\uparrow CN}}$$
(25)

As a last step, we need to build a realization of SP as an automaton that: (i) only changes its state upon the occurrence of observable events, since \(H_{a}^{CN}\) contains transitions with unobservable events; and (ii) whose active event set at each state of the realization is equal to the events enabled by the supervisor (attacker) at that state. Noting that marking of states may be relevant in the case of a marking supervisor, the standard process for automaton realization of a partial-observation supervisor (see Section 3.7.2 in Cassandras and Lafortune (2021)) can be followed. From Eqs. 24 and 25, we build an automaton realization of SP using \(H_{a}^{CN}\), where SP is such that

$$\mathcal{L}_{m}(S_{P}/G_{a}) = {\mathcal{L}_{m}(H_{a})^{\uparrow CN}}$$
(26)

and

$$\mathcal{L}(S_{P}/G_{a}) = \overline{{\mathcal{L}_{m}(H_{a})^{\uparrow CN}}}$$
(27)

First, we build the observer of \(H_{a}^{CN}\), \(Obs(H_{a}^{CN})\), with respect to Ea,o, using the standard process of observer construction (Cassandras and Lafortune 2021). Next, we add self loops for all events in Ea,cEa,uo that need to be enabled at each state of \(Obs(H_{a}^{CN})\), obtained by examining the corresponding states of \(H_{a}^{CN}\). The attack strategy of the successful For-all attacker is encoded in this realization, as desired.

Based on the above discussion, we introduce Algorithm 3 to synthesize For-all attacks with respect to the given Gnom, Ga and Gm (either a safety or nonblockingness monitor). We also state in Proposition 3 that Algorithm 3 returns the realization of a For-all attack-supervisor, if it exists, which encodes the attack strategy in order for the attacker to lead the plant to a violation of the given safety/nonblockingness monitor.

Algorithm 3
figure e

For-all attack synthesis.

Proposition 3

Suppose that Ha on line 2 or line 4 in Algorithm 3 is non-empty, i.e., Algorithm 1 or Algorithm 2 returns a non-empty solution. If ASSP (Problem 1) is solvable, then Algorithm 3 returns the realization of a For-all attack-supervisor.

Proof

Since \(E_{a,c} \subseteq E_{a,o}\), if there exists a solution of ASSP, then the supremal controllable and observable sublanguage of \({\mathcal{L}}_{m}(H_{a})\) exists and is equal to \({{\mathcal{L}}_{m}(H_{a})^{\uparrow CN}}\), which is a solution of ASSP. Thus from the proof of Theorem 1, a supervisor SP such that \({\mathcal{L}}_{m}(S_{P}/G_{a}) = {{\mathcal{L}}_{m}(H_{a})^{\uparrow CN}}\) and \({\mathcal{L}}(S_{P}/G_{a}) = \overline {{{\mathcal{L}}_{m}(H_{a})^{\uparrow CN}}}\) is a For-all attack-supervisor. Therefore, if ASSP is solvable, then Algorithm 3 returns the realization of a For-all attack-supervisor.

As long as Algorithm 3 returns a non-empty automaton, from Proposition 3, the above methodology results in a closed-loop system that produces For-all attacks, in the presence of the attacker. Since \(H_{a}^{CN}\) in Algorithm 3 is a trim automaton, we know that at any state in \(H_{a}^{CN}\), it is possible to reach a marked state, resulting in a violation of the monitor. Therefore, it is always possible for the attacker to eventually win.

Remark 2

When Ha output by Algorithm 1 or Algorithm 2 is not empty (i.e., when it has at least one marked state) but there is no For-all attack-supervisor (i.e., Algorithm 3 returns the empty solution), then we can conclude that there exists at least one There-exists attack-supervisor, according to Definition 3. For instance, one can take the supervisor Sall that always enables all events. Then \({\mathcal{L}}(S_{all}/G_{a}) = {\mathcal{L}}(G_{a})\) and \({\mathcal{L}}(G_{a}) \cap {\mathcal{L}}_{m}(H_{a}) = {\mathcal{L}}_{m}(H_{a})\) by construction of Ha. Hence, this attack-supervisor can reach any of the marked states in Ha where it “wins”, but the closed-loop system will be blocking. Techniques in SCT for synthesizing blocking supervisors, as described in Section 3.5.5 of (Cassandras and Lafortune 2021) for instance, can be employed to guide the design of There-exists attack-supervisors when no For-all attack-supervisor exists. Further investigation of There-exists attack-supervisors is beyond the scope of this paper.

5 ABP Case study

Our first case study for synthesis of For-all attacks is for the Alternating Bit Protocol (ABP), as studied and modelled in Alur and Tripakis (2017). The models of ABP components we use in this section are described in Example 1.

5.1 Safety property models

As introduced in Section 4.1, safety properties are represented by safety monitor automata which define what states in the system must not be reached, i.e., define illegal states. (Alur and Tripakis 2017) provides two safety monitor automata, \(G_{sm}^{1}\) and \(G_{sm}^{2}\), capturing the violation of safety properties for ABP, depicted in Fig. 6. The marked state q2 in \(G_{sm}^{1}\) and \(G_{sm}^{2}\) indicates the illegal state, namely, the safety property is violated if the monitor reaches this state from the initial state. \(G_{sm}^{1}\) expresses that:

  • deliver should happen after send, meaning that deliver of the ABP receiver and the Receiving client should not happen before the Sending client tells the ABP sender to send a bit to the forward channel.

  • After send happens, the next send should not occur before deliver occurs, meaning that the Sending client should wait for the acknowledgement signal from the ABP receiver.

On the other hand, \(G_{sm}^{2}\) expresses that:

  • done should happen after deliver, meaning that done of the ABP sender and the Sending client should not happen before the ABP receiver receives the signal and sends the acknowledgement to the ABP sender.

  • After deliver happens, the next deliver should not occur before done occurs, meaning that deliver cannot happen before the Sending client tells the ABP sender to send the next signal to the forward channel.

Since the safety monitors are provided as dedicated automata, \(G_{sm}^{1}\) and \(G_{sm}^{2}\), Gother in Algorithm 1 is equal to Gnom. In our ABP system model, Hnom on line 10 in Algorithm 1 has no marked states, thus we state that our ABP model is correct in terms of the safety properties. Namely, the nominal system (without attacker) does not violate the given safety properties.

5.2 Nonblockingness property models

The nonblockingness monitor in Fig. 7, Gnm, captures a violation of the nonblockingness property that the entire system should not get stuck, and should not keep invoking send. Namely, the first send should eventually be followed by a deliver. Gnm in Fig. 7 is a simplified version of a monitor provided by (Alur and Tripakis 2017) so that our nonblockingness monitor Gnm captures that the first transmission is never completed, which is adequate for our case study.

5.3 Attack model

As we consider the system architecture in Fig. 2b for ABP, the attacker infiltrates the forward and/or backward channels. To follow Algorithms 1 and 2, we first construct a modified model of the plant Ga in Eq. 8 under attack. Since the channels of ABP are under attack, we enhance GFC and GBC to those under attack, GFC,a and GBC,a, by adding new transitions to represent capabilities of the attacker. Note that if we keep either of the channels nominal, then GFC,a = GFC or GBC,a = GBC accordingly. Therefore, GC,a = GFC,aGBC,a.

The PITM attacker is represented by a modified forward or backward channel that can send the recipient a different packet from the incoming packet. For example, if the attacker has infiltrated the forward channel, then the attacker can send either \(p_{0}^{\prime }\) or \(p_{1}^{\prime }\) to the ABP receiver regardless of which p0 or p1 occurs. Figure 8 shows the attacked forward and backward channels. Red transitions are added to the original channel models in Fig. 4c and d. These new transitions enable the attacker to send whichever packet they want. To construct Ga, we model GFC,a and GBC,a as observer automata of \(G_{FC,a}^{nd}\) and \(G_{BC,a}^{nd}\), as was done for Gnom. Figure 9 depicts GFC,a and GBC,a, representing new transitions compared to Fig. 5 as red transitions.

As discussed in Section 3.1, we suppose that the attacker cannot control and observe events outside the channels. Therefore, the event set Ea is partitioned as follows:

  • Controllable events: \(E_{a,c} = \{p_{0}^{\prime }, p_{1}^{\prime }, a_{0}^{\prime }, a_{1}^{\prime }\}\)

  • Uncontrollable events: Ea,uc = {senddonetimeoutdeliverp0p1a0a1}

  • Observable events: \(E_{a,o} = \{p_{0}, p_{1}, p_{0}^{\prime }, p_{1}^{\prime }, a_{0}, a_{1}, a_{0}^{\prime }, a_{1}^{\prime }\}\)  

  • Unobservable events: Ea,uo = {senddonetimeoutdeliver}.

We consider that in our attack model, the attacker controls the output packets from the channels so that each safety or nonblockingness monitor in Sections 5.1 and 5.2 reaches its marked state, if possible.

5.4 Examination of the PITM attack for ABP

In this section, we examine the PITM attack for the above safety and nonblockingness properties of ABP according to the following steps:

  1. 1.

    Construct the plant under attack Ga as the parallel composition of the component models of ABP under attack, namely

    $$G_{a} = G_{S} \parallel G_{R} \parallel G_{C,a} \parallel G_{e}$$
    (28)

    where GC,a = GFC,aGBC,a and Ge = GSCGRCGT.

  2. 2.

    Using Algorithm 3, compute the realization of a For-all attack-supervisor with respect to Gnom, Ga and the safety/nonblockingness monitor for ABP.

For illustration purposes, if Algorithm 3 returns the realization of an attack-supervisor, we pick one example string from the initial state to one marked state in \({{\mathcal{L}}_{m}(H_{a})^{\uparrow CN}}\), which represents one system behaviour under attack that reaches a marked state in the monitor.

Ga varies depending on GC,a, namely which channel is under the PITM attack, so we consider the following three cases in each setup:

  1. 1.

    The forward channel is under the PITM attack (i.e. GBC,a = GBC):

    $$G_{a} = G_{S} \parallel G_{R} \parallel G_{FC,a} \parallel G_{BC} \parallel G_{e}$$
    (29)
  2. 2.

    The backward channel is under the PITM attack (i.e. GFC,a = GFC):

    $$G_{a} = G_{S} \parallel G_{R} \parallel G_{FC} \parallel G_{BC,a} \parallel G_{e}$$
    (30)
  3. 3.

    Both channels are under the PITM attack:

    $$G_{a} = G_{S} \parallel G_{R} \parallel G_{FC,a} \parallel G_{BC,a} \parallel G_{e}$$
    (31)

For clarity of presentation, we henceforth focus on the use of the safety monitor 1 and Ga in Eq. 29 in which the forward channel is under attack, as presented in Example 1. In other words, we consider Ha as the parallel composition of Ga in Eq. 29 and the safety monitor 1 \(G_{sm}^{1}\). The other cases of Eqs. 30 and 31 and the safety monitor 2 can be examined using the same procedure.

5.4.1 Attack against safety properties

Setup 1

Consider the PITM channels in Fig. 8 which represent a powerful attacker that can send packets to the recipient with whichever bit 0 or 1, regardless of the incoming packets.

Following our procedure, we found that Ha has 168 marked states out of 265 states and \(H_{a}^{CN}\) is non-empty. Here, \({\mathcal{L}}_{m}(H_{a}^{CN}) = {\mathcal{L}}_{m}(H_{a})\) and \({\mathcal{L}}(H_{a}^{CN}) = {\mathcal{L}}(G_{a})\), so Ha is already controllable and normal with respect to Ga, thus the attacker issues no disablement actions. Let us pick the example string \(send.p_{0}.p_{0}^{\prime }.deliver.a_{0}.p_{1}^{\prime }.deliver\), which means that the attacker sends the correct packet with bit 0 first, and afterwards sends a fake packet with bit 1 to the ABP receiver when it observes a0. In other words, the attacker inserts \(q_{1}^{\prime }\) soon after it observes a0. Consequently, \(G_{sm}^{1}\) captures the violation by reaching q2 with send.deliver.deliver.

Setup 2

Let us represent a less-powerful attacker by removing additional transitions from the PITM channels in Fig. 8. First, we remove all red transitions except \(p_{1}^{\prime }\) from f1 to f0 in Fig. 8a, so that the attacker can send packets with bit 1 at the particular timing. Let \(G_{FC,wa}^{nd}\) be the less powerful forward PITM channel derived from \(G_{FC,a}^{nd}\). Figure 10 shows \(G_{FC,wa}^{nd}\) and \(G_{FC,wa} = Obs(G_{FC,wa}^{nd})\). The red transitions are new ones compared to \(G_{FC}^{nd}\) and GFC.

Next, we compute Ga, Ha, and \(H_{a}^{CN}\) by following the steps at the beginning of Section 5.4. \(G_{a} = G_{S}^{\prime } \parallel G_{R} \parallel G_{FC,wa} \parallel G_{BC} \parallel G_{e}^{\prime }\) has 248 states, and \(H_{a} = G_{a} \parallel G_{sm}^{1}\) has 370 states and 228 marked states. \(H_{a}^{CN}\) is non-empty and consists of 1099 states and 771 marked states. In every case, \({\mathcal{L}}(H_{a}^{CN}) = {\mathcal{L}}(G_{a})\), so no disabling happens. As the example string in \(H_{a}^{CN}\), we pick \(send.p_{0}.p_{0}^{\prime }.deliver.a_{0}.p_{1}^{\prime }.deliver\) which is the same as that in Setup 1, but \(H_{a}^{CN}\) here is not equivalent. Let \((H_{a}^{CN})_{2}\) be \(H_{a}^{CN}\) here and \((H_{a}^{CN})_{1}\) be \(H_{a}^{CN}\) in Setup 1. Since \((H_{a}^{CN})_{2}^{comp} \times (H_{a}^{CN})_{1}\) is non-empty, we conclude that \((H_{a}^{CN})_{2}\) lacks some attack strategies, but one additional \(p_{1}^{\prime }\) in \(G_{FC,wa}^{nd}\) is enough to cause the violation of the safety property.

Setup 3

Let us make the attacker much less powerful than in Setup 2, by building a new automaton of the infiltrated forward channel and changing the sets of controllable and observable events.

Consider the new automaton of the infiltrated forward channel, depicted in Fig. 11. We denote this new automaton by \(G_{FC,a}^{oneshot,nd}\) and its observer by \(G_{FC,a}^{oneshot}\), namely \(G_{FC,a}^{oneshot} = Obs(G_{FC,a}^{oneshot,nd})\). This forward channel means that the attacker can send a fake packet with bit 1 to the ABP receiver only once (one-shot attacker). After the fake packet, the channel’s behaviour will get back to normal. Moreover, we consider the following controllable and observable event sets:

  • Controllable events: \(E_{a,c} = \{p_{1}^{\prime }\}\)

  • Observable events: \(E_{a,o} = \{p_{0}, p_{1}, p_{0}^{\prime }, p_{1}^{\prime }, a_{0}, a_{1}, a_{0}^{\prime }, a_{1}^{\prime }\}\)

meaning that the attacker can observe events in both of the channels, but can only control \(p_{1}^{\prime }\) in the (infiltrated) forward channel. By following the procedure as we have done, Ga in Eq. 29, where \(G_{FC,a} = G_{FC,a}^{oneshot}\), has 334 states. Also, Ha = GaGsm1 has 190 marked states out of 431 states, and \(H_{a}^{CN}\) is non-empty. Moreover, \({\mathcal{L}}_{m}(H_{a}^{CN}) \neq {\mathcal{L}}_{m}(H_{a})\) and \({\mathcal{L}}(H_{a}^{CN}) \neq {\mathcal{L}}(G_{a})\), thus the attacker issues event disablement actions during its attack on the system. For illustration, we pick the following example string in \(H_{a}^{CN}\):

$$send.p_{0}.p_{0}^{\prime}.deliver.a_{0}.a_{0}^{\prime}.done.send.p_{1}.p_{1}^{\prime}.deliver.a_{1}.a_{1}^{\prime}.done .send.p_{0}.p_{0}^{\prime}.deliver.a_{0}.{a_{1}^{\prime}}.{p_{1}^{\prime}}.deliver.a_{1}$$

By observation, the second \(a_{1}^{\prime }\) is of nonadversarial error packets which are sent mistakenly, and the second \({p_{1}^{\prime }}\) is inserted by the attacker. Note that the attacker can observe \(p_{1}^{\prime }\) and \(a_{1}^{\prime }\) here. Accordingly, this string means that the attacker can lead the system to the undesired state by sending the fake packet \(p_{1}^{\prime }\) only once after the observation of one error packet. Moreover, the attacker disables \(p_{1}^{\prime }\) several times before sending the fake \(p_{1}^{\prime }\). Therefore, in this case, the violation is caused “by chance”, since the attacker exploits errors, but that violation is enabled by the attacker’s intervention. It is worth mentioning that if we remove the events in the backward channel (i.e., a0, a1, \(a_{0}^{\prime }\) and \(a_{1}^{\prime }\)) from Ea,o, then \(H_{a}^{CN}\) is empty. This means that the attacker needs to observe the behaviour of the backward channel so as to exploit nonadversarial errors to attack. Moreover, if we set \(E_{a,c} = \varnothing\) and \(E_{a,o} = \{p_{0}, p_{1}, p_{0}^{\prime }, p_{1}^{\prime }, a_{0}, a_{1}, a_{0}^{\prime }, a_{1}^{\prime }\}\), then \(H_{a}^{CN}\) is empty again, meaning that the attacker needs to have the controllability of \(p_{1}^{\prime }\) to attack successfully.

5.4.2 Attack against nonblockingness properties

Setup 4

Consider that the attacker wants the system to violate the nonblockingness property represented by the nonblockingness monitor Gnm in Fig. 7. Let us examine the system under attack where the forward channels are infiltrated by the attacker, namely Ga in Eq. 29. Note that the forward PITM channel here is that in Fig. 8a which is quite powerful. Since Gnm is given as a dedicated automaton, we build Ha = Trim(Gother,aGnm) where Gother,a = Ga.

In this case, Ga consists of 174 states, and Ha comprises 14 states and 13 marked states. \(H_{a}^{CN}\) is non-empty and consists of 10 states and 9 marked states. As the example string in \(H_{a}^{CN}\), we pick string send.p0.p1.a1.timeout which means that the attacker sends a fake packet with bit 1 to the ABP receiver after it observes p0, and expects the system to suffer from timeout. Moreover, from \(H_{a}^{CN}\), the attacker-supervisor disables \(p_{0}^{\prime }\) to prevent deliver, resulting in \({\mathcal{L}}(H_{a}^{CN}) \neq {\mathcal{L}}(G_{a})\). Therefore, there exist no deliver transitions in \(H_{a}^{CN}\). This result shows that the attacker successfully leads the system to violate the nonblockingness property that send should eventually be followed by deliver.

6 TCP case study

Our second case study concerns one of the major protocols in the Internet, the Transmission Control Protocol (TCP) (Postel 1981). TCP is widely used to communicate through unreliable paths. We consider a communication architecture as in Fig. 2a. Each peer sends and receives packets to and from channels, and the network interconnects channels to relay the incoming packets to their destinations. As in Von Hippel et al. (2020a), we consider the connection establishment phase of TCP, based on three-way handshake, and do not model the congestion control part of that protocol.

6.1 Component models of TCP

Let Gnom in Eq. 6 be the entire connection establishment part of TCP without an attacker. Based on the architecture of TCP introduced in Von Hippel et al. (2020a), we consider Gnom as the parallel composition of the following components:

  • GPA = (XPAEPAfPAxPA,0XPA,m): Peer A

  • GPB = (XPBEPBfPBxPB,0XPB,m): Peer B

  • GC1 = (XC1EC1fC1xC1,0XC1,m): Channel 1

  • GC2 = (XC2EC2fC2xC2,0XC2,m): Channel 2

  • GC3 = (XC3EC3fC3xC3,0XC3,m): Channel 3

  • GC4 = (XC4EC4fC4xC4,0XC4,m): Channel 4

  • GN = (XNENfNxN,0XN,m): Network

namely

$$G_{nom} = G_{PA} \parallel G_{PB} \parallel G_{C1} \parallel G_{C2} \parallel G_{C3} \parallel G_{C4} \parallel G_{N}$$
(32)

Hence, GC = GC1GC2GC3GC4 and Ge = GN, so Eq. 32 reduces to Eq. 7.

fvThe event sets are defined as follows:

$$\begin{array}{@{}rcl@{}} E_{PA} &=& \{listen_{A}, timeout_{A}, deleteTCB_{A}, SYN_{AC1}, SYN_{C2A}, \\ &&ACK_{AC1}, ACK_{C2A}, FIN_{AC1}, FIN_{C2A}, SYN\_ACK_{AC1}, SYN\_ACK_{C2A}\} \end{array}$$
(33)
$$\begin{array}{@{}rcl@{}} E_{PB} &=& \{listen_{B}, timeout_{B}, deleteTCB_{B}, SYN_{BC3}, SYN_{C4B}, \\ &&ACK_{BC3}, ACK_{C4B}, FIN_{BC3}, FIN_{C4B}, SYN\_ACK_{BC3}, SYN\_ACK_{C4B}\} \end{array}$$
(34)
$$\begin{array}{@{}rcl@{}} E_{C1} & =& \{SYN_{AC1}, SYN_{C1N}, ACK_{AC1}, ACK_{C1N}, FIN_{AC1}, FIN_{C1N}, SYN\_ACK_{AC1}, SYN\_ACK_{C1N}\} \end{array}$$
(35)
$$\begin{array}{@{}rcl@{}} E_{C2} & =& \{SYN_{NC2}, SYN_{C2A}, ACK_{NC2}, ACK_{C2A}, FIN_{NC2}, FIN_{C2A}, SYN\_ACK_{NC2}, SYN\_ACK_{C2A}\} \end{array}$$
(36)
$$\begin{array}{@{}rcl@{}} E_{C3} & =& \{SYN_{BC3}, SYN_{C3N}, ACK_{BC3}, ACK_{C3N}, FIN_{BC3}, FIN_{C3N}, SYN\_ACK_{BC3}, SYN\_ACK_{C3N}\} \end{array}$$
(37)
$$\begin{array}{@{}rcl@{}} E_{C4} & =& \{SYN_{NC4}, SYN_{C4B}, ACK_{NC4}, ACK_{C4B}, FIN_{NC4}, FIN_{C4B}, SYN\_ACK_{NC4}, SYN\_ACK_{C4B}\} \end{array}$$
(38)
$$\begin{array}{@{}rcl@{}} E_{N} &=& \{SYN_{C1N}, SYN_{C3N}, SYN_{NC2}, SYN_{NC4}, \end{array}$$
(39)
$$\begin{array}{@{}rcl@{}} &&ACK_{C1N}, ACK_{C3N}, ACK_{NC2}, ACK_{NC4}, FIN_{C1N}, FIN_{C3N}, FIN_{NC2}, FIN_{NC4}, SYN\_ACK_{C1N}, SYN\_ACK_{C3N}, SYN\_ACK_{NC2}, SYN\_ACK_{NC4}\} \end{array}$$

Hence

$$E_{nom} = E_{PA} \cup E_{PB} \cup E_{C1} \cup E_{C2} \cup E_{C3} \cup E_{C4} \cup E_{N}$$
(40)

The subscripts in the event names indicate the directions of packets. For example, “AC1” means packets from Peer A to Channel 1. Note that the subscripts “A” and “B” are added to “listen” and “deleteTCB” to make these events private.

Figures 12131415, and 16 depict the models of the above TCP components. GPA and GPB illustrate the sequence of three-way handshake and cleanup. We mark the states “closed”, “listen”, and “established” in the automata of the peers, because the peer should not stay in other states during communication, based on Postel (1981). We also mark all states in the automata of the channels and network, to prevent these automata from marking the system. Namely,

$$X_{C1,m} = X_{C1}, \quad X_{C2,m} = X_{C2}, \quad X_{C3,m} = X_{C3}, \quad X_{C4,m} = X_{C4}, \quad X_{N} = X_{N,m}.$$

6.2 Safety property models

In Von Hippel et al. (2020a), the safety/liveness property of interest is defined as a threat model (TM). TM explains the property using Linear Temporal Logic (LTL) (Baier and Katoen 2008). In this paper, we represent the required properties in Von Hippel et al. (2020a) as finite-state automata.

Von Hippel et al. (2020a) provides one threat model, TM1, for one relevant safety property of TCP. TM1 defines the safety property that if Peer A is at state “closed”, then Peer B should not be at state “established”, because both peers should consecutively reach their “established” states after beginning the connection handshake. Let \(G_{sm}^{TM1}\) be the safety monitor to capture the violation of TM1. We represent \(G_{sm}^{TM1}\) as the parallel composition of the automata in Fig. 16 where the marked states are only “closed” in Peer A and “established” in Peer B, namely \(G_{sm}^{TM1} = G_{PA}^{TM1} \parallel G_{PB}^{TM1}\), where \(G_{PA}^{TM1} = (X_{PA}^{TM1}, E_{PA}^{TM1}, f_{PA}^{TM1}, x_{PA,0}^{TM1}, X_{PA,m}^{TM1})\) and \(G_{PB}^{TM1} = (X_{PB}^{TM1}, E_{PB}^{TM1}, f_{PB}^{TM1}, x_{PB,0}^{TM1}, X_{PB,m}^{TM1})\). Note that

$$\begin{array}{*{20}l} X_{PA}^{TM1} = X_{PA}, \quad E_{PA}^{TM1} = E_{PA}, \quad x_{PA,0}^{TM1} = x_{PA,0}, \quad X_{PA,m}^{TM1} = \{closed\} \neq X_{PA,m}, \\ X_{PB}^{TM1} = X_{PB}, \quad E_{PB}^{TM1} = E_{PB}, \quad x_{PB,0}^{TM1} = x_{PB,0}, \quad X_{PB,m}^{TM1} = \{established\} \neq X_{PB,m} \end{array}$$

Hence, the marked states in \(G_{sm}^{TM1}\) are illegal states, capturing that Peer A is at “closed” and Peer B is at “established” simultaneously.

Since the safety monitor for TM1, \(G_{sm}^{TM1}\), is derived from GPA and GPB, Gother in Algorithm 1 is the parallel composition of the automata of the channels and network, namely Gother = GC1GC2GC3GC4GN. Let Hnom in Algorithm 1 be a nominal specification automaton (without attacker) for TM1. In our system model of TCP, \(H_{nom} = Trim(G_{nm}^{TM1} \parallel G_{other})\) has no marked states, thus we conclude that our TCP model, without attackers, is correct in terms of TM1.

6.3 Nonblockingness property models

Von Hippel et al. (2020a) also provides two liveness properties denoted as TM2 and TM3. TM2 defines the liveness property that Peer 2 should eventually reach the “established” state. TM3 requires that both peers should not get stuck except at “closed” state, that is, no deadlocks except at “closed” state are allowed. Both TM2 and TM3 requires the system to remain alive during the communication process. In our case study, we translate TM2 and TM3 into “equivalent” nonblockingness properties as expressible representations in the SCT framework, thus slightly abusing the notations “TM2” and “TM3” in Von Hippel et al. (2020a).

We construct the nonblockingness monitors of TM2 and TM3, \(G_{nm}^{TM2}\) and \(G_{nm}^{TM3}\), by following Section 4.2. In this case, the nonblockingness monitors are not given as dedicated automata, thus we construct \(G_{nm}^{TM2}\) and \(G_{nm}^{TM3}\) based on Gnom and Ga. We discuss the construction of \(G_{nm}^{TM2}\) and \(G_{nm}^{TM3}\) in Section 6.5, because to build these automata, we rebuild Gnom and Ga as new automata according to TM2 and TM3.

6.4 Attack model

In this section, we explain the attack model for TCP. As we consider the system architecture in Fig. 2a for TCP, the attacker infiltrates the network. First, we construct a modified model of the plant Ga in Eq. 8 under attack. Since the network of TCP is under attack, we enhance GN to that under attack, GN,a = (XN,a,EN,a,fN,a,xN,a,0,XN,a,m), by adding new transitions and events to represent the capabilities of the attacker. Thus,

$$G_{a} = G_{PA} \parallel G_{PB} \parallel G_{C1} \parallel G_{C2} \parallel G_{C3} \parallel G_{C4} \parallel G_{N,a}$$
(41)

Figure 14 depicts the PITM attacked model of the network, GN,a, where “ATTK” is the set of events of outgoing packets from the network, namely

$$ATTK = \{SYN_{NC2}, ACK_{NC2}, FIN_{NC2}, SYN\_ACK_{NC2}, SYN_{NC4}, ACK_{NC4}, FIN_{NC4}, SYN\_ACK_{NC4}\},$$
(42)

representing multiple transitions, illustrated as the red transitions, by events in ATTK. Hence, the event set of GN,a, EN,a, is as follows:

$$E_{N,\;a} = ATTK \cup E_{N}$$
(43)

where EN is in Eq. 40. This allows the attacker to be flexible so that the attacker can send any packets and freely choose the destination of packets. As in the discussion in Section 3.1 and in the ABP model, we suppose that the attacker cannot control and observe events outside the network. Hence, the event set of Ga, Ea, is partitioned for controllability and observability of the attacker as follows:

  • Controllable events: Ea,c = ATK in Eq. 42

  • Uncontrollable events: Ea,uc = EnomEa,c

  • Observable events: Ea,o = EN,a in Eq. 43

  • Unobservable events: Ea,uo = EnomEa,o

In our attack model, the attacker controls the outgoing packets from the network, to lead the safety/nonblockingness monitor to reach its marked (illegal) state.

6.5 Examination of the PITM attack for TCP

In this section, we examine whether a For-all attack exists in terms of TM1, TM2, and TM3. As in Section 5.4, we try to synthesize a For-all attack by the following procedure:

  1. 1.

    Construct the plant under attack Ga in Eq. 41.

  2. 2.

    Using Algorithm 3, compute the realization of a For-all attack-supervisor with respect to Gnom, Ga and the safety/nonblockingness monitor for TCP.

As done in Section 5.4, if Algorithm 3 returns the realization, we pick one example string from the initial state to one marked state in \({{\mathcal{L}}_{m}(H_{a})^{\uparrow CN}}\), which represents one system behaviour under attack that reaches the marked state in the monitor.

6.5.1 Threat model 1 with channels

Setup 1

Let us consider a powerful attacker represented by GN,a in Fig. 14. By following the above procedure, Ga has 118761 states and 6307 marked states, and Ha has 38270 states and 704 marked states.

Next, we compute \(H_{a}^{CN}\) with respect to Ga and Ha by following the procedure for TM1. As a result, \(H_{a}^{CN}\) is non-empty, having 52783 states and 626 marked states, and \({\mathcal{L}}_{m}(H_{a}^{CN})\) contains the string

$$SYN_{BC3}.SYN_{C3N}.SYN\_ACK_{NC4}.SYN\_ACK_{C4B}.ACK_{BC3}$$

which steers \(G_{sm}^{TM1}\) to its marked states. Therefore, we conclude that there exists a For-all attacker SP defined in Eq. 25 with respect to Ga and Ha in this setup. From \({\mathcal{L}}(G_{a}) \neq {\mathcal{L}}(H_{a}^{CN})\), the attacker disables some transitions by controllable events in Ga, to always eventually win.

6.5.2 Threat model 1 without channels

Setup 2

One may find that in our TCP model, the channels just relay the incoming packets to their destinations, without any deletion or manipulation of packets. Since we assume ideal channels, we can reduce the communication architecture in Fig. 2a to that without channels, namely the architecture in Fig. 3.

Fig. 3
figure 3

Communication overview without channels

Due to the removal of the channels, to assure the synchronization of the peers and network in the parallel composition, we rename the subscripts of the events in EPA in Eq. 33, EPB in Eq. 34, EN in Eq. 40, and EN,a in Eq. 43, as follows:

$$\begin{array}{@{}rcl@{}} &&AC1 \to AN, \quad C2A \to NA, \quad BC3 \to BN, \quad C4B \to NB, \\ &&C1N \to AN, \quad NC2 \to NA, \quad C3N \to BN, \quad NC4 \to NB \end{array}$$
(44)

According to this change, the new Gnom and Ga are as follows:

$$\begin{array}{@{}rcl@{}} G_{nom} &=& G_{PA} \parallel G_{PB} \parallel G_{N} \end{array}$$
(45)
$$\begin{array}{@{}rcl@{}} G_{a} &=& G_{PA} \parallel G_{PB} \parallel G_{N,a} \end{array}$$
(46)

Gnom in Eq. 45 is trim, consisting of 41 states and 5 marked states, and Ga in Eq. 46 comprises 580 states and 27 marked states, and is not trim. Since we removed the automata of the channels from our system model, Gother and Gother,a in Algorithm 1 are equal to GN and GN,a, respectively. Even after the removal of the channels, Hnom has no marked states.

Noting that \(E_{a,c} \subseteq E_{a,o}\) still holds after renaming, let us revisit the procedure at the beginning of Section 6.5 for the construction of Ha and the computation of \(H_{a}^{CN}\) with the new Ga. In this setup, Ha consists of 547 states and 3 marked states, \(H_{a}^{CN}\) with respect to Ga and Ha is non-empty with 513 states and 3 marked states. \({\mathcal{L}}_{m}(H_{a}^{CN})\) contains the following string:

$$SYN_{BN}.SYN_{NB}.ACK_{BN}.ACK_{NB}$$
(47)

where SY NNB and ACKNB are fake packets inserted by the attacker, tricking Peer B into reaching “established” whereas Peer A does not move out from “closed”. Finally, from \({\mathcal{L}}(H_{a}^{CN}) \neq {\mathcal{L}}(G_{a})\) and non-trim Ga, the attack-supervisor disables several transitions in Ga.

Setup 3

As we have done in the ABP case study, let us consider a less-powerful attacker than the previous setups. First, we change the controllable events for the attacker, Ea,c, as follows:

$$\begin{array}{@{}rcl@{}} E_{a,c} &=& \{SYN_{AN}, SYN\_ACK_{NB}\} \end{array}$$
(48)
$$\begin{array}{@{}rcl@{}} E_{a,uc} &=& E_{a} \setminus E_{a,c} \end{array}$$
(49)

whereas Ea,o and Ea,uo do not change. Note that \(E_{a,c} \subseteq E_{a,o}\) still holds. SY NAN in Ea,c means that the attacker can discard SYN packets coming from Peer A. Next, we redesign the infiltrated network by the attacker, GN,a, to represent the reduced capability of the attacker. Figure 15 indicates the model of an infiltrated network by a less powerful attacker, \(G_{N,a}^{w}\). The red transitions are where the attacker can take action.

From the change of GN,a to \(G_{N,a}^{w}\), we change Ga to the entire system under the less powerful PITM attack, namely \(G_{a} = G_{PA} \parallel G_{PB} \parallel G_{N,a}^{w}\), in this setup. As a result, the new Ga is not trim, consisting of 48 states, 7 marked states, and 1 deadlock state. Because \(G_{sm}^{TM1}\) is not different from Setup 2, \(G_{other,a} = G_{N,a}^{w}\) here. Therefore by following the same procedure as above, Ha comprises 47 states and 1 marked state, and \(H_{a}^{CN}\) with respect to Ga and Ha here is non-empty with 63 states and 2 marked states, containing the following string leading \(G_{sm}^{TM1}\) to its marked state:

$$SYN_{BN}.SYN\_ACK_{NB}.ACK_{BN}$$
(50)

In conclusion, there still exists a For-all attacker with the less-powerful PITM model.

From \(G_{N,a}^{w}\) in Fig. 15, the attacker can send a fake SYN_ACK packet to Peer B only when Peer B enters “SYN sent” state, and the attacker must keep Peer A at “closed” state. Hence, the attacker must disable SY NAN at “closed” state in GPA shown in Fig. 16 where the subscripts of events are changed as in Eq. 44, and \({\mathcal{L}}(H_{a}^{CN}) \neq {\mathcal{L}}(G_{a})\) reflects this disablement action. Therefore, if SY NAN is uncontrollable, then \(H_{a}^{CN}\) is empty.

6.5.3 Threat model 2

Consider GPA, GPB, GN, and GN,a in Setup 2. Recall that Threat Model 2 (TM2) requires Peer A to reach its “established” state eventually. To design the nonblockingness monitor which captures the violation of TM2, we first unmark all states of GPA and mark its “established” state. Let \(G_{PA}^{TM2}\) be a new automaton derived from GPA in Fig. 16a by this marking and renaming as in Eq. 44. In contrast to the construction of safety monitors, \(G_{PA}^{TM2}\) captures the desired behaviour where Peer A reaches its “established” state eventually. Thus we construct Gnom and Ga as follows:

$$\begin{array}{@{}rcl@{}} G_{nom} &= G_{PA}^{TM2} \parallel G_{PB} \parallel G_{N} \end{array}$$
(51)
$$\begin{array}{@{}rcl@{}} G_{a} &= G_{PA}^{TM2} \parallel G_{PB} \parallel G_{N,a} \end{array}$$
(52)

To prevent it from marking Gnom and Ga, we mark all states in GPB, so the marked states of Gnom and Ga are determined by the “established” state in \(G_{PA}^{TM2}\).

Setup 4

Let us construct Ha by following Algorithm 2. First of all, Gnom in Eq. 51 is trim, thus the system model without attacker is correct in terms of TM2, meaning that Peer A eventually reaches its “established” state. So, let us proceed to the next step. From the additional transitions of GN,a in Fig. 14, Ga in Eq. 52 is not trim, thus Ga contains several deadlock and/or livelock states. In this scenario, we build \(G_{nm}^{TM2}\) for TM2 based on Ga and not as a separate automaton. In Ga, there are 25 deadlock states. These deadlock states are those the attacker wants Ga to reach so that Peer A cannot always reach its “established” state. To design \(G_{nm}^{TM2}\) representing the violation of TM2, namely reaching the deadlock states, we unmark all states in Ga and then mark all the deadlock states. Hence, let \(G_{nm}^{TM2}\) be the new automaton built by the marking of deadlock states in Ga, so that every string in \({\mathcal{L}}_{m}(G_{nm}^{TM2})\) ends with one of the deadlock states in Ga. Finally, the specification automaton for the attacker is \(H_{a} = Trim(G_{nm}^{TM2})\).

In this case, Ha consists of 580 states and 25 deadlock states which are determined by Ga, and \(H_{a}^{CN}\) with respect to Ga and Ha is non-empty, where \({\mathcal{L}}_{m}(H_{a}^{CN})\) contains the following string:

$$SYN_{AN}.SYN\_ACK_{NA}.ACK_{AN}.FIN_{NA}.SYN_{BN}.SYN_{NB}.ACK_{AN}$$
(53)

SY N_ACKNA, FINNA, and SY NNB in Eq. 53 are fake packets inserted by the attacker. This string makes Peer A and Peer B stuck at “close wait” state and at “i1” state, respectively. Here, \({\mathcal{L}}(H_{a}^{CN}) = {\mathcal{L}}(G_{a})\), thus the attacker just inserts fake packets and does not disable any controllable events. In conclusion, there exists a For-all attack for TM2 in this setup.

6.5.4 Threat model 3

In this section, we examine whether any For-all attacks against the Threat Model 3 (TM3) exist. TM3 captures the following nonblockingness requirement for the system: the peers should not suffer from any deadlocks if they leave “closed” state.

Consider GPA, GPB, GN, and GN,a in Setup 2 again. Since TM3 is defined by a nonblockingness property, we design a nonblockingness monitor for TM3 similarly as a monitor for TM2, discussed in Section 6.5.3. According to TM3, we first unmark all states and mark “closed” state in GPA and GPB. Let \(G_{PA}^{TM3}\) and \(G_{PB}^{TM3}\) be the new automata derived from GPA and GPB in Fig. 16 by this marking and renaming as in Eq. 44, respectively. Since \(G_{PA}^{TM3}\) and \(G_{PB}^{TM3}\) capture the desired behaviour of the system model, we construct Gnom and Ga as follows:

$$\begin{array}{@{}rcl@{}} G_{nom} &=& G_{PA}^{TM3} \parallel G_{PB}^{TM3} \parallel G_{N} \end{array}$$
(54)
$$\begin{array}{@{}rcl@{}} G_{a} &=& G_{PA}^{TM3} \parallel G_{PB}^{TM3} \parallel G_{N,a} \end{array}$$
(55)

Since all states in GN and GN,a are marked, the marked states in Gnom and Ga are determined by “closed” state of \(G_{PA}^{TM3}\) and \(G_{PB}^{TM3}\).

Setup 5

We construct Ha using Algorithm 2. First, Gnom in Eq. 54 consisting of 41 states and 1 marked state is trim, thus our system model without attacker is correct in terms of TM3. This means that neither Peer A nor Peer B suffers from deadlocks and/or livelocks when they are not at “closed” state. In the next step, due to GN,a, Ga in Eq. 55 comprising 580 states and 3 marked states is not trim, thus Ga contains deadlock and/or livelock states. In particular, Ga has 25 deadlock states and no livelock states. Since the nonblockingness monitor for TM3, \(G_{nm}^{TM3}\), is not given as a dedicated automaton, \(G_{nm}^{TM3}\) is derived from Ga by unmarking all states and marking the 25 deadlock states in Ga. Finally, we have \(H_{a} = Trim(G_{nm}^{TM3})\).

As a result, Ha in this setup consists of 580 states and 25 marked (deadlock in Ga) states, and \(H_{a}^{CN}\) with respect to Ga and Ha is non-empty with 660 states and 25 marked states. To see a behaviour of the system under the attack, we pick the following example string in \({\mathcal{L}}_{m}(H_{a}^{CN})\):

$$listen_{A}.SYN_{BN}.SYN_{NA}.SYN\_ACK_{AN}.ACK_{NA}.FIN_{AN}.ACK_{NA}$$
(56)

where the fifth and seventh ACKNA are fake packets sent from the attacker to Peer A. This string makes Peer A and Peer B stuck at “FIN wait 2” and “SYN sent”, respectively. Here, \({\mathcal{L}}(H_{a}^{CN}) = {\mathcal{L}}(G_{a})\), thus the attacker inserts fake packets and does not disable any controllable events. To sum up, there exists a For-all attack for TM3 in this setup.

7 Conclusion

We investigated the synthesis problem of For-all attacks under which the attacker can always eventually win, in the specific context of person-in-the-middle attacks on two well-known communication protocols, ABP and TCP, where in each case a sender and a receiver communicate over channels and a network. We formulated this problem in the framework of discrete event systems in order to leverage its supervisory control theory for attacker synthesis. We showed that the synthesis of a For-all attack can be formulated as the problem of finding a maximal controllable and observable sublanguage of the specification language for the attacker with respect to the given plant and the capabilities of the attacker in terms of controllable and observable events. The plant is the combination of the models of the sender, receiver, channels, and network. The specification language for the attacker is derived from a suitable specification automaton; we described in Sections 4.1 and 4.2 how to construct that automaton for various examples of safety properties and nonblockingness properties, respectively. The goal of the attacker is to force a violation of the given safety or nonblockingness property of the communication protocol. We formally derived in Sections 5 and 6, when they existed, several For-all person-in-the-middle attacks for ABP and TCP under different scenarios of attacker capabilities and safety or nonblockingness property to be violated. We are not aware of any prior work where formal methods are used to synthesize attacks on ABP. For the case of TCP, our results extend the results in Von Hippel et al. (2020a), where the authors considered the synthesis of There-exists attacks under which the attacker may not always win, but will sometimes win. In total, we presented four setups for ABP and five setups for TCP, where the plant, specification, and event partitions vary. Further setups are discussed in the expanded version of this paper available at Matsui and Lafortune (2022).

In the PITM attack setups we considered, it was reasonable to assume that the attacker observes all the events it controls. Hence, the synthesis of a For-all attack reduced to the computation of the supremal controllable and normal sublanguage in supervisory control theory of discrete event systems. This means that the methodology that we employed for ABP and TCP could be applied to other protocols and other types of attacks that can be modelled as additional transitions in the transition structure of the protocol. This shows that formulating attacker synthesis as a supervisory control problem is a powerful approach in the study of vulnerabilities of distributed protocols. In the future, it would be of interest to investigate how to make distributed protocols more resilient to both There-exists and For-all attacks.