1 Introduction

A cyber-physical system (CPS) comprises physical infrastructure that is controlled by computation and communication frameworks. It includes a combination of interconnected components, such as Programmable Logic Controllers (PLCs), sensors, actuators, a Supervisory Control and Data Acquisition (SCADA) workstation, and a Human Machine Interface (HMI), that communicate across a network. The PLCs monitor the present state of the system through the SCADA system and implement the corresponding control actions to facilitate the proper progress and functioning of the sub-processes.

The normal operation of a CPS requires the network and physical elements to work in tandem, for they directly influence the physical processes. Communication among such industrial IoT devices is useful, but it also exposes them to malicious entities  [1, 2]. This makes the design of security measures for a CPS more complicated than for pure IT systems, because attacks can occur in both the cyber and physical domains  [3].

Since an inter-connected CPS also incorporates wireless communication, the infrastructure is prone to remote breaches and attacks  [4]. This can be detrimental as it endangers the crucial communication links between the different nodes in a CPS, allowing them to be manipulated by external entities. By influencing the underlying processes in a CPS, cyber attacks could sabotage its physical infrastructure. Physical attacks can damage the sensors or other devices, which compromises the integrity of the data. This is a major risk as it results in faulty data being forwarded to the controllers, which adversely affects the control actions that are computed based on it. Conventionally, security research is focused on detecting anomalies in the communication network part of a CPS  [5]. However, physical attacks can be more difficult to detect as they may not be reflected in the system network  [6].

In this work, case studies are done on a water treatment testbed and a water distribution testbed, wherein model-based approaches for attack detection are considered. The sensor and control data from these plants under normal operation is used to derive Linear Time-Invariant (LTI) system models. These models are created using a control-theoretic approach, thus allowing the physical dynamics of the underlying processes to be captured analytically. The attack detection methods are then applied to the residual (the difference between the estimated and actual sensor values).

The detection performances of three attack detection techniques are evaluated in this paper. The first two methods are statistical change detectors called Cumulative Sum (CUSUM) and Bad-Data detectors that identify instances of abnormal data using empirically determined thresholds. The third technique is a machine learning-based device fingerprinting method called NoisePrint  [3].

While gauging the performance, apart from precision, another important consideration for the attack detection techniques is their sensitivity. This refers to their tendency of raising false alarms when the plants operate normally. This is vital due to its implications in practical scenarios, wherein a system of numerous physical components needs to be checked. Hence, the detection mechanisms are evaluated under normal operating conditions as well as when the plants are under several attacks to acquire a comprehensive understanding of their performance.

The motivation for this work is to exhaustively test and compare attack detection techniques for CPS on different testbeds. The implementation of such methods on real-world systems is able to provide some useful insights to address the following issues:

  1.

    Impact of Noise on System Models: Implementing and verifying the theoretical models raised several problems, one of them being the process noise, which differs for each run. Noise from environmental disturbances acting on the process causes unpredictable deviations from its modelled behavior.

  2.

    Sensor Faults: One of the problems was the unseen faults in sensors even during the normal operation of the plant, which hindered the creation of useful system models. This means that during the data collection under normal operation, the components must be thoroughly checked to ensure that all of them are functioning properly.

  3.

    Data Availability and Reliability: Data availability plays a vital role in the design and performance of an anomaly detector. Prior to model creation, it is necessary to procure sufficient data that (a) represents the components’ entire performance cycle, and (b) covers all possible modes of operation of the Industrial Control System (ICS) in the absence of momentary glitches and outliers. In general, when a dataset is created for a study, the plant is run continuously under normal operating conditions. The same has been done in this study to obtain the data for creating the models. However, when these models were tested on the plant while it was not running, unexpected outcomes were observed.

  4.

    Attack Detection Speed: The speed with which a process anomaly is detected is of prime concern for the safety of the plant, but it is often ignored as a performance attribute [7]. Rapid detection allows for appropriate actions to be taken earlier, thereby mitigating the impact. Therefore, Time Taken for Detection (TTD) has been used as an important performance metric in this study, while highlighting its significance.

Organization: The remainder of this paper is organized as follows. The mathematical modelling of the two testbeds as systems is explained in Sect. 2. The attack detection framework in Sect. 3 briefly explains the working of the three detection techniques that form the focus of this paper. Following this, Sect. 4 defines the attacker profile while detailing the potential attack scenarios and their execution. The performance of the detection mechanisms is evaluated in Sect. 5, whereby the techniques are tested under normal and attack conditions. Based on the analysis of the results obtained, the conclusions that map to the contributions above, are presented in Sect. 6.

2 System Model

2.1 Two Testbeds: Our Playground

Research facilities with operational testbeds of prevalent cyber-physical systems have been utilised to implement the security strategies and test their capabilities. As mentioned earlier, these include a secure water treatment plant (SWaT)  [8] and a water distribution plant (WADI)  [9]. These are operational, scaled-down plants that simulate the larger industrial infrastructure found in cities today. The physical process here is that of water flow, wherein the water undergoes specific processes, e.g., ultra-filtration and reverse osmosis. The plants are divided into different stages, each carrying out a specific sub-process. The detailed workings of the testbeds are explained in  [8, 9].

2.2 System Models

Each of the two testbeds is treated as a multi-input, multi-output system, following the model-based approach. A system model represents the dynamics of a physical process using a mathematical formulation. Sub-space system identification techniques are used to obtain models of the following form, for a system with p control inputs (actuators) and m outputs (sensors):

$$\begin{aligned} \left\{ \begin{array}{ll} x_{k+1} = Ax_k + B u_k + v_k, \\ y_k = Cx_k + \eta _k. \end{array} \right. \end{aligned}$$
(1)

where k represents the time instance, \(x \in \mathbb {R}^n\) is the system state vector of n states, \(A \in \mathbb {R}^{n \times n}\) is the state matrix, \(B \in \mathbb {R}^{n \times p}\) is the control matrix, \(y \in \mathbb {R}^m\) is the vector of the measured outputs, \(C \in \mathbb {R}^{m \times n}\) is the measurement matrix, and \(u \in \mathbb {R}^p\) denotes the system control input.

The state-space matrices A, B, and C capture the system dynamics and can be used to find a specific system state given an initial state. The sensor and process noise vectors are represented by \(\eta _k\) and \(v_k\), respectively.
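As a concrete illustration, the model in Eq. (1) can be rolled forward in simulation. The matrices below are toy values chosen for this sketch, not the identified SWaT/WADI models:

```python
import numpy as np

def simulate(A, B, C, u_seq, x0, noise=0.01, seed=0):
    """Roll Eq. (1) forward: x_{k+1} = A x_k + B u_k + v_k, y_k = C x_k + eta_k."""
    rng = np.random.default_rng(seed)
    x, ys = x0, []
    for u in u_seq:
        ys.append(C @ x + noise * rng.standard_normal(C.shape[0]))  # eta_k: sensor noise
        x = A @ x + B @ u + noise * rng.standard_normal(x.shape)    # v_k: process noise
    return np.array(ys)

A = np.array([[0.9, 0.1], [0.0, 0.8]])  # n = 2 states (toy values)
B = np.array([[0.5], [1.0]])            # p = 1 control input
C = np.array([[1.0, 0.0]])              # m = 1 sensor
y = simulate(A, B, C, [np.array([1.0])] * 50, np.zeros(2))
print(y.shape)  # (50, 1)
```

The same function can produce the model estimates \(\hat{y}_k\) by setting the noise to zero, which is the basis of the validation step described next.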

2.3 Validation of the System Models

It is necessary to validate the models created for each of the systems. For this, the state-space matrices from the system identification process are applied and the estimates for the output of the system are obtained. These modelled values and real-time sensor measurements are then compared. The difference between the measured sensor values and estimates is considered using the root mean square error (RMSE). The RMSE value for N readings is given as follows:

$$ \text {RMSE} = \sqrt{\frac{\sum _{i=1}^{N} \big ( y_i -\hat{y}_i \big )^2}{N}}. $$

where \(y_i\) is the actual i-th sensor reading, and \(\hat{y}_i\) is the i-th model estimate.
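A minimal sketch of this validation step, with hypothetical measured and estimated readings for one sensor:

```python
import numpy as np

# RMSE between measured readings y and model estimates y_hat, as defined above.
def rmse(y, y_hat):
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

# Hypothetical level-sensor readings (mm) versus model estimates.
print(rmse([659.0, 660.0, 661.0], [659.5, 660.5, 660.5]))  # 0.5
```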

The accuracy of the system identification-based model for 6 sensors in the SWaT testbed is shown in Table 1 as an example, and it can be seen that this model has high accuracy. In control theory literature, models with accuracy as high as 70% are considered a sufficiently precise approximation of real system dynamics  [10, 11].

Table 1. Validating SWAT model obtained from sub-space system identification.

3 Attack Detection Framework

This work focuses on detecting attacks on sensors, primarily by validating the incoming readings. This is done by (1) estimating the sensor output using the system model, and (2) examining the residual between the actual and estimated values and verifying the source of the sensor readings. The second step is in turn done using the three different detectors (CUSUM, Bad-Data and NoisePrint) for comparison.

System Model and Estimation: The concept of creating system models is explained in the previous section. These can be obtained either using data-based techniques or from first principles  [12,13,14]. Using the system model, it is possible to estimate the states of the system and ultimately predict the output from a sensor applying Eq. 1. At a time instance k, a residual vector (\(r_k\)) is calculated by taking the difference between the sensor measurements (\(y_k\)) and estimated sensor output (\(\hat{y}_k\)), which is given as:

$$\begin{aligned} r_k = y_k - \hat{y}_k. \end{aligned}$$
(2)

For the residual, hypothesis testing distinguishes \(\mathcal {H}_0\), the normal mode (no attacks), from \(\mathcal {H}_1\), the faulty mode (with attacks). The residuals are obtained using this data and the state estimates. The two hypotheses are stated as follows:

(The formal statements of the two hypotheses appear as a figure in the original.)

Threshold-Based Detection: To detect the presence of an attack, the residual vector is tested against a predefined threshold designed for a particular false alarm rate. A threshold is created for the residual distribution, and while testing the model against the actual data from the plant, an attack is declared if the residual values exceed that threshold:

$$\begin{aligned} |r_k| > \tau , \; Alarm = \text {TRUE} \end{aligned}$$
(3)

where \(\tau \) is the threshold and \(|r_k|\) is the absolute value of the residual. There have been studies on optimizing the parameters of different stateful and stateless detectors  [13, 14]. Next, the three attack detection techniques deployed in this study are outlined.

3.1 Cumulative Sum (CUSUM) Detector

The standard CUSUM  [15] procedure is explained using the following equations.

(The CUSUM equations, Eqs. 4–5, appear as a figure in the original.)

From Eqs. 4–5, it can be observed that the CUSUM values \(S^+_{k,i}\) and \(S^-_{k,i}\) accumulate the distance measure \(r_{k,i}\) over time to quantify how far the residual values are from the target mean (\(\bar{T}_i\)). The slack variable \(\kappa \) can be adjusted to tune this window for error. The parameters are chosen suitably to achieve a required false alarm rate.
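The two-sided CUSUM recursion can be sketched as follows. The parameter names (target mean, slack \(\kappa \), alarm threshold h) follow the standard formulation, and the values and residual sequence below are illustrative, not tuned for the testbeds:

```python
# Standard two-sided CUSUM on the residual of one sensor i.
def cusum_alarms(r, T_bar=0.0, kappa=0.5, h=4.0):
    s_pos = s_neg = 0.0
    alarms = []
    for rk in r:
        s_pos = max(0.0, s_pos + (rk - T_bar) - kappa)  # accumulates upward drift
        s_neg = max(0.0, s_neg - (rk - T_bar) - kappa)  # accumulates downward drift
        alarms.append(s_pos > h or s_neg > h)
    return alarms

r = [0.1, -0.2, 0.1] + [1.5] * 6   # small bias injected from the 4th sample onward
print(cusum_alarms(r))             # alarms only once the bias has accumulated
```

Note how the detector stays silent for several samples after the bias appears; this accumulation is what trades detection speed for robustness to momentary glitches.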

3.2 Bad-Data Detector

The Bad-Data detector is widely used in the CPS security literature  [16].

(The Bad-Data detector rule appears as a figure in the original.)

Using the Bad-Data detector, an alarm is triggered if the distance measure, taken as \(|r_{k,i}|\), exceeds the threshold \(\alpha _i\). Analogous to the CUSUM procedure, the parameter \(\alpha _i\) is selected to satisfy a required false alarm rate.
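Since the Bad-Data rule is the stateless test of Eq. (3) applied per sensor, it reduces to a per-sample threshold check on the residual. A minimal sketch with hypothetical readings:

```python
import numpy as np

# Stateless Bad-Data check: alarm when |r_k| = |y_k - y_hat_k| exceeds alpha.
# The readings and the threshold below are illustrative.
def bad_data_alarms(y, y_hat, alpha):
    r = np.abs(np.asarray(y, float) - np.asarray(y_hat, float))
    return (r > alpha).tolist()

y     = [659.0, 659.2, 850.0, 659.1]   # third reading spoofed
y_hat = [659.0, 659.0, 659.0, 659.0]   # model estimates
print(bad_data_alarms(y, y_hat, alpha=5.0))  # [False, False, True, False]
```

Unlike CUSUM, this detector carries no state between samples, which is why it reacts instantly to large biases but cannot accumulate evidence of small, persistent ones.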

3.3 NoisePrint (Machine Learning-based Device Fingerprinting)

NoisePrint is a sensor fingerprinting technique that makes use of a Support Vector Machine (SVM) [3]. It is based on the principle that when the system is in steady state  [17], the residual vector of its model is a function of sensor and process noise. Therefore, it is possible to extract these sensor and process noise characteristics of the given ICS from the residual vectors. Following this, pattern recognition techniques such as machine learning are applied on the residual vectors to fingerprint the given sensor and process.

The proposed scheme begins with data collection, which is then divided into smaller chunks to extract a set of time-domain and frequency-domain features. The features are combined and labeled with a sensor ID, and a machine learning algorithm is used to classify the sensors based on their noise profiles. For more details, the interested reader is referred to  [3, 18].
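A rough sketch of this pipeline, assuming scikit-learn's SVC as the classifier; the feature set below is illustrative, and the exact features used in  [3] may differ:

```python
import numpy as np
from sklearn.svm import SVC

# Illustrative feature extractor: a few time-domain statistics plus the mean
# magnitude of low-frequency FFT bins.
def features(chunk):
    spec = np.abs(np.fft.rfft(chunk))
    return [chunk.mean(), chunk.std(), np.sqrt(np.mean(chunk**2)), spec[1:4].mean()]

rng = np.random.default_rng(0)
X, labels = [], []
for sensor_id, scale in [(0, 0.1), (1, 0.5)]:   # two sensors, distinct noise levels
    for _ in range(40):                          # 40 residual chunks per sensor
        X.append(features(scale * rng.standard_normal(64)))
        labels.append(sensor_id)

clf = SVC(kernel="rbf").fit(X, labels)           # fingerprint classifier
print(clf.predict([features(0.1 * rng.standard_normal(64))]))
```

Detection then amounts to checking whether incoming residual chunks are classified as the claimed sensor; a spoofed stream whose noise profile differs is flagged.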

4 Threat Model

Since the attacks taken into consideration for this work are on sensors, a few assumptions have been made about the attacker. These are given as follows:

  1.

    The attacker has access to \(y_{k,i} = C_i x_{k} + \eta _{k,i}\) (i.e., the i-th sensor measurements at the \(k^{th}\) time instance).

  2.

    The attacker has the knowledge about the system dynamics, the state-space matrices, the control inputs and outputs, and the implemented detection measure.

Tables 2 and 3 show the attacks carried out on SWaT and WADI. Based on their execution, these can be classified as follows:

  • Single-point Attack—these types of attacks target a single point in the system, manipulating its value and/or disrupting its communication link.

  • Multi-point Attack—in these types of attacks, multiple points are targeted simultaneously.

  • Stealthy Attack—these are the attacks wherein the data value of a sensor is altered very slightly, which makes it difficult to detect the abnormality.

Table 2. List of attacks (SWaT): column 1 states the attack ID, and column 2 provides the details, wherein the ‘/’ separates the system state before and during the attack.

The single- and multi-point attacks, in turn, can be single-stage or multi-stage. In single-stage attacks, the attack points are limited to one particular stage of the plant, whereas in multi-stage attacks, the target points can be spread across several stages. In real scenarios, these choices depend on the attacker’s competence, extent of access and intentions.

Table 3. List of attacks (WADI): column 1 states the attack ID, and column 2 provides the details, wherein the ‘/’ separates the system state before and during the attack.

The attacks mentioned in Tables 2 and 3 simulate data injection attacks of two kinds:

  • Bias Injection Attack: The attacker’s goal in this type of attack is to deceive the control system by sending incorrect sensor readings. The attack vector in such a scenario can be defined as:

    $$\begin{aligned} \bar{y}_k = y_k + {\delta _k}, \end{aligned}$$
    (7)

    where \(\bar{y}_k\) is the general sensor measurement at a time instance k, \(y_k\) is the actual sensor reading and \({\delta _k}\) is the bias injected by the attacker. For example, Atk-2-s in Table 2 is a simple attack wherein a bias is added to the LIT-101 reading such that the value read by the PLC is changed from the original 659 mm to a spoofed value of 850 mm. Similarly, in Atk-2-w in Table 3, the 2-FIT-001 value is changed from its original 0 m\(^3\)/h to 1.5 m\(^3\)/h, and the control actions taken by the PLC are based on this fake value.

  • Stealthy Attack: In this case, the attack vector \({\delta _k}\) in Eq. (7) is chosen so that it remains inconspicuous to statistical detectors. This is because in these types of attacks, the residual vector may not noticeably change or exceed the thresholds, which statistical detectors need in order to confirm an attack. An example of a stealthy attack is Atk-1-s from Table 2. In this attack, the reading of LIT-101 is originally 659 mm, and during the course of the attack, a small bias is repeatedly injected such that this value gradually increases by 1 mm every second.
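The two injection types can be sketched as follows, reusing the 659 mm LIT-101 baseline from the examples above (the readings themselves are hypothetical):

```python
# Bias injection (Atk-2-s style) and a stealthy 1 mm/s ramp (Atk-1-s style),
# both instances of Eq. (7) with different choices of delta_k.
def bias_attack(y, delta):
    return [yk + delta for yk in y]

def stealthy_ramp(y, step=1.0):
    return [yk + step * k for k, yk in enumerate(y, start=1)]

y = [659.0] * 5                      # true readings (mm), one per second
print(bias_attack(y, 191.0))         # [850.0, 850.0, 850.0, 850.0, 850.0]
print(stealthy_ramp(y))              # [660.0, 661.0, 662.0, 663.0, 664.0]
```

The ramp keeps each per-sample deviation tiny, which is precisely what keeps the residual under the statistical detectors' thresholds.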

Such attacks are operational technology (OT) attacks that aim to compromise the normal performance of the plant by manipulating sensor and/or actuator states. The SCADA system coupled with the SWaT and WADI testbeds provides an option of manually altering the sensor/actuator values that are being sent to the PLCs, and this function has been used to simulate some of the simple bias injection attacks. For the more complicated attacks, customised Python programs have been developed that gradually change the attack vector to simulate a stealthy attack. Custom-coded modules developed at iTrust Labs  [19], which communicate with the LabVIEW-based SCADA interface, have been used to launch the stealthy attacks.

5 Performance Evaluation

5.1 Performance Metrics

The precision and sensitivity of the attack detection method are part of the criteria to analyse its effectiveness. The following metrics have been used to assess the three procedures:

  • True Positive Rate (TPR) and False Negative Rate (FNR)—The TPR is the percentage of samples over the duration of an attack for which the method correctly raises an alarm (predicts an attack). The FNR is an alternate way of expressing the same metric:

    $$\begin{aligned} \text {FNR} = 100 \ \% - \text {TPR} \end{aligned}$$
  • False Positive Rate (FPR) or False Alarm Rate (FAR)—this refers to the number of times the method incorrectly raises alarms in the absence of any attack.

  • Time Taken for Detection (TTD)—this refers to the time taken by the procedure to raise an alarm in the event of an attack.
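A minimal sketch of how these metrics can be computed from per-sample alarm flags recorded during an attack (1 Hz sampling assumed; the flag sequence is illustrative):

```python
# TPR, FNR, and TTD from alarm flags recorded while an attack is active.
def tpr(alarms):
    return 100.0 * sum(alarms) / len(alarms)   # % of attack samples flagged

def ttd(alarms):
    """Seconds until the first alarm; None if the attack is missed entirely."""
    for k, flag in enumerate(alarms):
        if flag:
            return k
    return None

alarms = [False, False, True, True, True]             # first alarm at sample 2
print(tpr(alarms), 100.0 - tpr(alarms), ttd(alarms))  # 60.0 40.0 2
```

The FPR is computed the same way as the TPR, but over data recorded under normal operation, where every alarm is by definition false.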

The TPR of the technique is a direct indication of its attack detection accuracy and must be as high as possible. The FPR represents the tendency of the procedure to raise false alarms, which is extremely inconvenient in practical scenarios, and should be satisfactorily small. A high TPR is not very beneficial if the mechanism takes too long to detect the attack. This is because in a realistic sense, the CPS performs critical, large-scale processes that influence the surrounding economy in multiple ways. A significant delay in the detection of an attack can be detrimental not only to the system itself, but also to its end-users. Therefore, the detection mechanism must have reasonable TTD.

In practical applications, there often exists a trade-off between a high TPR and a low FPR. A detection method may have a high FPR while managing to achieve a good TPR. Likewise, it is also possible to design for a low FPR but at the cost of missing some attacks, resulting in a low TPR. Hence, the two rates must always be balanced such that a satisfactory TPR is attained while having a feasible FPR.

5.2 Normal Operation

As emphasized earlier, attack detection mechanisms must be designed in a way such that they do not raise too many false alarms. Hence, the detection techniques were implemented on both the plants, and their performances were observed when the plants were under normal operation.

For both the plants, the thresholds for the CUSUM and Bad-Data detectors have been designed to allow an FPR of 5% (or less). This is done to account for the temporary aberrations caused by technical glitches or external disturbances, which often occur in practical industrial plants. Each detector has thresholds and design parameters dedicated to each sensor, which are presented in Tables 4 and 5. It can be seen in these tables that, for both plants, these two attack detection methods generate false alarms within a reasonable window around the designed limit.

Figure 1 shows the residual from the system identification-based model for the level sensor (2-LT-002) in WADI. It can be seen that it mostly remains below its Bad-Data threshold during normal operation, shown in Fig. 1a. Likewise, the CUSUM values also stay within the thresholds for 2-LT-002 under normal operation, as seen in Fig. 1b. This implies that the design of the Bad-Data and CUSUM thresholds is in accordance with the requirement and it is feasible to implement these detectors on the plants under normal operating conditions.

When tested on SWaT, NoisePrint performed very well, with low or zero FPRs for almost all of the sensors. However, in the case of WADI, the FPRs for most of the sensors were above the desired 5%. The sensors in WADI are known to be sensitive to disturbances from the environment, thus resulting in some faults in their measurements, and this could be the reason NoisePrint fails to perform well.

From these figures and tables, it can be concluded that the detection methods perform satisfactorily well on both the testbeds under normal operating conditions. The x-axis for all the figures is the time in seconds for which sensor data is plotted. However, it is to be noted that these figures are for demonstration purposes only and do not show the complete dataset. For the normal operation of the water plants, the dataset is collected for more than a week and the attack data ranges from 5–30 min for each attack  [20]. The FPR is only shown for the normal data evaluations. As for the case of the attack evaluation table in the following section, the data used was recorded only when the sensors were under attack, and hence shows FNR only. The rate (TPR) is calculated using the number of alarms raised for the whole duration of the attack.

Table 4. False positives under normal operation in SWaT.
Table 5. False positives under normal operation in WADI.

5.3 Attack Detection

The three detection techniques were tested under different attack scenarios on both the plants. Tables 2 and 3 show the attacks carried out on SWaT and WADI, respectively. The residuals for the sensors from the system identification-based models were obtained and the detection techniques were applied while the plants were under attack. The performance metrics were computed for the different attacks on each of the testbeds and can be seen in Tables 6 and 7.

In the case of SWaT, it can be seen in Table 6 that the CUSUM and Bad-Data detectors perform well under a variety of bias injection attacks, like Atk-11-s, Atk-4-s and Atk-5-s. However, they fail to detect the stealthy attacks Atk-17-s and Atk-6-s. In contrast, NoisePrint is able to successfully detect the presence of all attacks, including the stealthy ones, and demonstrates a comparable TPR in the other cases. The attacks that report poor TPR under the CUSUM and Bad-Data thresholds can be detected better using NoisePrint. However, this superior performance comes at the cost of detection speed: the time taken by the CUSUM and Bad-Data detectors to confirm the occurrence of an attack is considerably less than that of NoisePrint, i.e., they have a better TTD.

Fig. 1.
figure 1

Statistical attack detection methods applied on the residual for level sensor (2-LT-002) estimates from WADI under normal operation. The x-axis shows the number of sensor readings sampled at 1-s intervals.

Figure 2 shows the residual when the level sensor (LIT-101) in SWaT is under a stealthy attack. In this attack, the attacker spoofs the sensor measurement at the same value as the last known normal reading, thus deceiving the controller while the real process state continues to progress differently. As seen in Fig. 2a, the residual stays below the threshold during the stealthy attack. Similarly, in Fig. 2b, the CUSUM values also always stay below the CUSUM thresholds. This shows that the stealthy attack could not be detected by either of the two detectors. However, as mentioned in Table 6, NoisePrint is able to detect this attack.

In the case of WADI, when the CUSUM detector is implemented on the residuals obtained from the system models, unsatisfactory TPRs are reported for all the attacks, as shown in Table 7. The Bad-Data detector performs reasonably well for attacks Atk-2-w and Atk-7-w, while NoisePrint shows a 100% TPR for attacks Atk-2-w, Atk-3-w and Atk-7-w. Both methods report poor TPRs for the other attacks. Similar to the case of SWaT, the TTD of NoisePrint is much higher than that of the Bad-Data detector.

These results show that while the statistical detectors, Bad-Data and CUSUM, are successfully able to confirm basic attacks such as bias injections, they fail to detect the more complicated stealthy attacks. This is expected because stealthy attacks are devised such that they do not tend to cause substantial changes to the residuals obtained from models, thereby ensuring the thresholds that determine the presence of an attack are not crossed. On the other hand, NoisePrint is able to identify such attacks, since the attacker may not be able to replicate the process and sensor noise, which form the basis of detection in NoisePrint. However, despite achieving better accuracy, NoisePrint falls behind in terms of detection speed.

Given the nature and performance of the detection mechanisms, the practical applicability of the methods can be challenged. The testbeds used in this work are small-scale and hence, obtaining complete system models for them was a feasible task. This might not be the case for actual industrial CPSs. A possible solution would be dividing the larger plants into several sub-stages (based on the processes taking place) and having multiple models, one corresponding to each sub-system.

In the case of NoisePrint, its longer detection time might render it less suitable for some industrial CPSs, such as power grids, which require immediate response during attacks or anomalies. However, its accuracy is an important advantage when it comes to large systems with several sensors, and the method is still applicable to CPSs wherein the attacks would take a longer time to cause any physical harm.

Table 6. Attack detection performance on SWaT testbed.
Table 7. Attack detection performance on WADI (System identification model).
Fig. 2.
figure 2

Statistical attack detection methods (Bad-Data and CUSUM) applied on the residual for level sensor (LIT-101) estimates from SWaT under stealthy attack

6 Conclusions

From the model validation results, it is understood that the models generated using well-established system identification algorithms perform reasonably well. An important insight is that obtaining a normal reference system model for the plants and sensors sensitive to environmental disturbances (e.g., for the WADI testbed in this study) is a non-trivial task. It is deduced that bias injection attacks on sensors that are quite similar to faults can be easily detected using statistical techniques like Bad-Data and CUSUM detectors. However, it is observed that advanced stealthy attacks require more sophisticated detection techniques, like NoisePrint. From the various tests carried out on the plants, it is concluded that while detection methods must be able to demonstrate accuracy, their attack detection speed is also a crucial metric for critical CPSs.