Introduction

Safe and secure operation of the electrical power systems is a critical challenge and ranks as the highest priority of the stakeholders of the electricity markets. Besides inevitable malfunctions of the power grid components, deliberate disruptions caused by malicious attacks put the security of the power systems at high risk. Integration of the intelligent devices into the power grid operations has made the power grid increasingly reliant on the information and communication technologies. The integrated cyber-physical nature of the modern power systems has created a large and complex infrastructure that necessitates advanced cyber and physical security mechanisms. In this paper, we introduce the concept of the holistic resiliency cycle (HRC) that emphasizes the necessity of considering the power systems security problem holistically.

HRC is a systematic view of power systems security, characterized by four closely interconnected stages that are explicable only by reference to the whole: (i) prevention and planning, (ii) detection, (iii) mitigation and response, and (iv) system recovery. We review the literature on cyber-physical security of power systems and analyze it in terms of the HRC stages. The goal of the paper is to study the strengths and weaknesses of the power systems literature from the HRC perspective and to highlight future research directions that enhance the cyber-physical security of power systems.

The rest of the paper is organized as follows. Section II investigates the physical security of power systems. Section III addresses the cyber security of power systems and evaluates the power systems resiliency against such attacks. Section IV reports our conclusions.

Physical Security of Power Systems

A report from the Wall Street Journal revealed that 274 deliberate attacks on power grid components occurred in 2011–2014 [1]. Physical attacks on power systems components not only disrupt the power supply to customers but also cause substantial economic burdens for the other stakeholders of the power sector such as utility companies, transmission system operators, and distribution system operators [2]. As a case in point, 17 large-scale power transformers were damaged in a recent attack on a substation in California on April 16, 2013, which also cost 27 days of repair time [2]. Damage to critical power grid components may cause cascading outages and even blackouts [3]. Different protection mechanisms have been discussed to enhance the physical security of power grids. Intrusion detection devices, access controls, lighting, fencing, cameras, sensors, and buffer zone security are suggested as protection mechanisms with lower reliability and moderate investment costs [2]. A communication mechanism can be devised to alert guards or police, shortening the response time to intrusions and reducing the potential attack damage. More reliable protection mechanisms such as undergrounding or double circuiting of transmission lines require much higher investment costs [2]. Hence, protection of all grid components against physical attacks is impractical and economically unjustifiable.

Power grid resilience against physical attacks has attracted the interests of the research community as well. Salmeron et al. [4] proposed a bilevel mathematical model to identify the most disruptive attack scenario given that the attackers have resource limitations. Similarly, Donde et al. [5, 6] developed screening algorithms to identify contingencies that cause severe damage to the power grid. These proposed models identify the critical components for protection such that the damage caused by the most disruptive attack would reduce if the protection plan is implemented. The authors in [7,8,9] developed a variety of trilevel optimization models within the defender-attacker-defender framework for power network defense considering different scenarios and contingencies. These models devise the best resiliency plan when the attacker plots his attack with the perfect knowledge of the protected components. Multilevel optimization problems are complicated to solve. Salmeron et al. [10] applied decomposition methods to effectively solve such large-scale protection optimization models. Furthermore, a variety of game theory models, such as static games, leader-follower games, zero-sum Markov games, are proposed in [11,12,13,14,15,16,17] to tackle the defender-attacker problems for enhanced power systems physical security.
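
The nested structure of these defender-attacker models can be illustrated with a brute-force toy example. The sketch below is not any of the cited formulations: the component names, the damage values, and the additive damage rule are hypothetical, and a real model would evaluate each attack with a power flow calculation. Note that it also assumes a protected component cannot be attacked at all.

```python
from itertools import combinations

# Hypothetical load shed (MW) if each component is successfully attacked.
DAMAGE = {"line_A": 120, "line_B": 80, "transformer_C": 200, "substation_D": 150}


def worst_attack(protected, attack_budget):
    """Inner (attacker) problem: choose the most damaging unprotected components."""
    targets = [c for c in DAMAGE if c not in protected]
    k = min(attack_budget, len(targets))
    best = max(combinations(targets, k),
               key=lambda attack: sum(DAMAGE[c] for c in attack))
    return best, sum(DAMAGE[c] for c in best)


def best_defense(defense_budget, attack_budget):
    """Outer (defender) problem: protect the set that minimizes worst-case damage."""
    plans = (
        (set(p), *worst_attack(set(p), attack_budget))
        for p in combinations(DAMAGE, defense_budget)
    )
    return min(plans, key=lambda plan: plan[2])


protected, attack, damage = best_defense(defense_budget=1, attack_budget=2)
print(protected, attack, damage)
```

Enumeration works only for a handful of components; the number of defense and attack combinations grows combinatorially, which is why the literature turns to decomposition methods for realistic networks.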

From the HRC perspective, the existing research on physical security of power systems has focused on the prevention and planning stage while taking into account mitigation of damages and response to potential attacks. The common shortcoming in these studies is the assumption that the protected components will be completely secure and no longer at risk, which limits the application of these models in the real world. Future research needs to address this issue and provide a more reliable solution. Furthermore, the widespread structure of power grids makes the detection of a physical attack prior to its occurrence next to impossible unless the components are protected with sensors, cameras, or guards. Last but not least, the recovery stage of HRC for physical attacks has barely been studied in the literature. Power system recovery after natural disasters, a similar topic, has been well studied and could be used as a benchmark for studying power system recovery after physical attacks.

Cyber Security of Power Systems

Smart grid advancements have made cyber security a critical challenge for power systems operators. Data availability, data integrity, and data confidentiality are the main elements of cyber resiliency. Cyber attackers target these elements to manipulate the data being communicated for control and operations of the power systems in order to tamper with the grid, interrupt the safe operations of the power grid, gain financial advantage, or even damage the power grid physical structure. Many researchers, computer scientists in particular, have investigated prevention methods that keep cyber intruders away from the network devices and databases. Suo et al. [18•] reviewed and analyzed the state-of-the-art on cyber attack prevention technologies including encryption mechanisms, communication security, protecting sensor data, and cryptographic algorithms. To evaluate the state of cyber security of power systems from the HRC point of view, we further investigate methods and mechanisms proposed in the power systems literature for detection, response and mitigation, and recovery. In the next section, cyber attacks on power systems are studied and classified into two clusters: direct attacks and indirect attacks. Direct attacks target power systems databases and components whereas indirect attacks take advantage of the mutual dependency of power systems and Internet of things (IoT).

Direct Cyber Attacks to Power Systems

Direct cyber attacks are classified into four groups based on their functions as discussed below.

Data Intrusion Attacks

Data intrusion attacks are the most common group of cyber attacks threatening the security of power systems. Control mechanisms detect bad data caused by routine malfunctions of power systems devices such as imperfect measurements obtained from faulty sensors. However, a cyber attacker could gain access to power systems databases and shrewdly tamper with data such that the control center mechanisms cannot detect the anomaly. In general, there are three major types of data intrusion attacks: false data injection (FDI) attacks, load redistribution attacks (LRA), and denial of service (DoS) attacks.

In FDI attacks, introduced by Liu et al. [19], the attacker gains access to the current power systems configurations and manipulates the stored data and measurements in order to lead power systems operators toward making wrong and potentially harmful decisions. Mousavian et al. [3] showed how FDI attacks to the optimal power flow (OPF) module could cause overloaded transmission lines and result in power outages and physical damages. The authors used artificial neural networks to develop a detection algorithm against FDI attacks on OPF module [3]. The authors in [20] analyzed FDI attacks on the state estimation module and provided a new detection algorithm using the state variable distribution. Similarly, Li et al. [21] studied the injection of malicious data to the monitoring meters of the state estimation and developed a sequential detection method using the generalized likelihood ratio. Furthermore, Moslemi et al. [22] utilized the near chordal sparsity of the power grid to obtain the associated maximum likelihood function and detect FDI attacks on the state estimation. Liu et al. [23] combined features of the network traffic flow of information and power systems physical laws to create a detection model called abnormal traffic-indexed state estimation for a higher detection rate of FDI attacks to the state estimation.

Khalid et al. [24] studied FDI attacks on transmission systems and proposed a multisensor track-level fusion based prediction model to improve the resiliency of the transmission systems against such attacks. Phasor measurement units (PMU) can measure synchronized phasors of bus voltages and currents of transmission lines in real time for better observability of the power grid [25]. A PMU takes about 30 to 120 measurements per second and sends its measurements to a phasor data concentrator (PDC) through a wireless communication network [26, 27]. PMUs, supposedly trusted sensors that provide measurements for better resiliency and observability of the transmission systems, have been the target of FDI attacks as well [28,29,30]. The authors in [30] presented a detection method that uses the majority voting algorithm to identify the compromised PMU sending anomalous measurements. Waghmare et al. [31] proposed a two-stage detection method against FDI attacks on PMUs, which applies principal component analysis to reduce the dimensionality of the datasets and then uses the support vector machine (SVM) method. SVM has also been used to detect FDI attacks on SCADA control systems [32]. Similarly, He et al. [33] used deep learning methods and historical measurement data to detect FDI attacks on SCADA in real time.
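
The voting idea can be sketched in a few lines. This is only an illustration, not the algorithm of [30]: it assumes several PMUs observe the same quantity (here a per-unit voltage magnitude), uses the median as the majority value, and the PMU names, readings, and tolerance are hypothetical.

```python
from statistics import median


def flag_outlier_pmus(readings, tolerance=0.02):
    """Return the IDs of PMUs whose reading disagrees with the majority (median) value."""
    consensus = median(readings.values())
    return sorted(pmu for pmu, value in readings.items()
                  if abs(value - consensus) > tolerance)


# Hypothetical redundant voltage magnitude readings (per unit) for one bus.
readings = {"pmu_1": 1.01, "pmu_2": 1.00, "pmu_3": 0.99, "pmu_4": 1.15, "pmu_5": 1.00}
suspects = flag_outlier_pmus(readings)
print(suspects)
```

Such a vote is only reliable while fewer than half of the redundant units are compromised.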

A new class of FDI has been introduced in [34] as stealthy false data injection (SFDI) attacks. SFDI manipulates the gross errors from the measurement matrices such that the attack is undetectable by current detection schemes of the state estimation. Ashok et al. [35] developed a detection algorithm against SFDI attacks on state estimation, which utilizes synchrophasor measurements, load forecasts, and generation schedules. Mohammadpourfard et al. [36] assumed that injecting false data into the system causes a deviation in the probability distribution of the state vector and proposed an unsupervised method for detecting SFDI on state estimation. Yang et al. [37] proposed a method to detect SFDI attacks on PMUs, in which a neighborhood of sensors detects the attack by constantly checking the state of the nodes and sending rightness signals to the neighboring nodes. This method detects FDI in a smaller neighborhood of nodes, instead of the entire system, which gives the system operators the advantage of less computational complexity and faster detection. Mousavian et al. [38•] went one step further and developed a risk mitigation response to SFDI attacks on PMUs. They developed a mixed integer linear programming model that avoids or optimally slows down the propagation of cyber attacks while keeping the power systems observable. A similar study has been conducted for responding to SFDI attacks in the electric vehicle power station network [39, 40]. Lin et al. [41] extended the response model for PMU networks discussed in [38•] and proposed a self-healing strategy for PMU networks.

Load redistribution attacks, introduced in [42], are a special case of SFDI attacks in which the attacker manipulates the load data collected for state estimation such that the sum of the errors calculated by the state estimation remains minimal [42]. The adversary can pursue one of two goals with an LRA, immediate or delayed. The immediate attacking goal is to maximize the power systems operations cost immediately after the attack, whereas the delayed attacking goal is to gradually overload the power lines while the attack remains undetected, redistributing the load to maximize the operations cost at a certain time after the attack [42]. Yuan et al. [42, 43] developed detection models against LRAs. Related research revealed that an attacker does not need complete information about the network to execute an LRA and remain undetected [44]. A game-theoretic approach has been developed to present an optimal defense strategy against LRAs [45]. Furthermore, the authors in [46] quantified the influence of LRAs by modeling these intrusions as a semi-Markov model.
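
To make the stealth property concrete, the following sketch uses the standard linear (DC) measurement model z = Hx + e with a least-squares estimator and a residual test. It is a textbook construction rather than the method of any cited paper, and the matrix H, the state values, the noise, and the injected vectors are hypothetical. An injection of the form a = Hc shifts the estimated state by c but leaves the residual unchanged, whereas tampering with a single meter inflates it.

```python
import numpy as np

# Hypothetical measurement matrix: three direct state measurements
# followed by three difference (flow-like) measurements.
H = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, -1.0, 0.0],
    [0.0, 1.0, -1.0],
    [1.0, 0.0, -1.0],
])
x_true = np.array([0.10, -0.05, 0.02])
noise = 1e-3 * np.array([1.0, -2.0, 0.5, 1.5, -1.0, 0.3])  # fixed, for repeatability
z = H @ x_true + noise


def residual_norm(measurements):
    """Least-squares state estimate, then the norm of the measurement residual."""
    x_hat, *_ = np.linalg.lstsq(H, measurements, rcond=None)
    return float(np.linalg.norm(measurements - H @ x_hat))


naive = np.zeros(6)
naive[3] = 0.5                                # tamper with one meter only
stealthy = H @ np.array([0.05, 0.0, -0.03])   # structured injection a = H c

r_clean = residual_norm(z)
r_naive = residual_norm(z + naive)
r_stealthy = residual_norm(z + stealthy)
print(r_clean, r_naive, r_stealthy)
```

Any detector that thresholds only this residual treats the stealthy case exactly like clean data, which is why the detection schemes above bring in additional information such as load forecasts, generation schedules, or the distribution of the state vector.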

Denial of service attacks are a class of data intrusion attacks in which the adversary floods the service source with artificial load so that the service is no longer accessible to legitimate requests. The first DoS attack was committed in 1997 by Khan C. Smith during a DEF CON hacking conference; it disrupted internet access on the Las Vegas Strip for more than an hour. The distributed denial of service (DDoS) attack is an advanced version of DoS. A DDoS attack is initiated from multiple adversaries/nodes simultaneously, so that shutting down one adversary does not stop the attack and differentiating legitimate from artificial service requests is next to impossible.

Wang et al. [47] developed a novel method for preventing DoS attacks. This method, called Honeypot Game Model, introduces honeypots in the automated metering infrastructure (AMI) as decoys to gather information about attack and prevent it. Accordingly, an optimal defense strategy will be implemented by analyzing the interaction between the attacker and the defender using the Bayesian-Nash equilibria. Diovu et al. [48] proposed a method for preventing and also mitigating the impacts of DDoS. This method uses a firewall which is leveraged by the cloud computing technology and reduces the data computation and data storing burden of the automated metering infrastructure.

Lu et al. [49] proposed a detection algorithm against DDoS attacks. In this detection method, a pair of probes is sent from the service source to the service request node. Then, the Fourier-to-Time reconstruction algorithm is executed to verify the legitimacy of the service request based on the gap between the probes. Varalakshmi and Selvi [50] proposed a defense mechanism using an information divergence scheme to detect and discard the adversary’s artificial requests. Srikanthra and Kundur [51] showed that DoS attacks have the potential to disrupt the overall grid even if they are perpetrated on just a subset of cyber communication nodes. They proposed a collaborative reputation-based topology configuration that enables the remaining nodes to converge quickly and maintain dynamic stability while a subset of nodes is under attack, keeping the system away from total failure. Liu et al. [52] designed a response mechanism to such attacks: a communication subsystem that self-heals when jammed under attack, mitigating the impacts of DoS attacks while keeping the system operating with minimum impact on its service level. This subsystem is designed via an intelligent local switching controller, and its purpose is to let local controllers collect sufficient readings from smart meters to estimate the state of the system. Furthermore, Clela et al. [53] proposed a defense scheme based on rule-based feedback control for mitigating the impacts of DoS attacks on islanded microgrids.

Non-Technical Loss Fraud

Non-technical loss (NTL) fraud, also known as theft attacks, is intended to manipulate the attacker’s consumption data. Theft attacks are less likely to be detected because their impact is supposedly small compared to the entire operations of the power grid. However, the financial burden of theft attacks is significant. The annual cost of theft attacks is close to 6 billion dollars in the USA [54] and 25 billion dollars worldwide [55].

Pasdar and Mirzakuchaki [56] proposed a detection algorithm in 2007 that sends test signals at high frequency to consumers and calculates the impedance of the related connections. A similar approach, combined with real-time tracking of consumers, was introduced in [57, 58]. The authors in [59] investigated the theft attack on AMI and proposed a detection method called AMI intrusion detector system (AMIDS). AMIDS tracks both cyber and physical consumption data and meter audit logs to identify electricity fraud. The authors in [60] proposed a two-stage detection method that clusters high-risk consumers and then monitors their consumption profiles. Villar-Rodriguez et al. [61] utilized time series analysis and probabilistic data mining to detect theft attacks. Due to the large scale of the problem, machine learning is extensively used to develop detection algorithms, which monitor the usage profiles of the consumers and identify electricity consumption fraud based on anomalies in the usage patterns [62,63,64,65,66,67].
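
As a much simpler stand-in for the machine learning detectors cited above, the sketch below flags a consumer whose latest metered usage falls far below that consumer's own history, using a z-score. The consumer IDs, the readings, and the threshold are hypothetical, and a practical detector would also need to account for seasonality and legitimate changes in behavior.

```python
from statistics import mean, stdev


def flag_usage_drops(history, recent, z_threshold=3.0):
    """Flag consumers whose recent metered usage is far below their own history."""
    flagged = []
    for consumer, past in history.items():
        z = (recent[consumer] - mean(past)) / stdev(past)
        if z < -z_threshold:
            flagged.append(consumer)
    return flagged


# Hypothetical daily consumption (kWh) for two consumers.
history = {"c1": [30, 32, 31, 29, 33], "c2": [45, 44, 46, 47, 43]}
recent = {"c1": 30, "c2": 20}
flagged = flag_usage_drops(history, recent)
print(flagged)
```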

Time Delay Attacks

Time delay attacks, introduced in 2014 by Sargolzaei et al. [68], interfere with the control signal. Receiving the control signal at the right time is crucial for controlling the system. A time delay attack simply delays the control signal on its way to the control center. Hence, the control center uses measurement data from an earlier period to control the current performance of the system, which could make the system unstable and prone to damaging attacks. Sargolzaei et al. [69] proposed a prevention method for time delay attacks on load frequency control. Furthermore, Shafique and Iqbal [70] developed a controller for load frequency control based on linear matrix inequalities and Lyapunov-Krasovskii functional-based delay-dependent stability criteria. Sargolzaei et al. [71] developed a detection algorithm against time delay attacks in mobile ad hoc networks. Time delay attacks are relatively new, and research in this area is still evolving.
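
The destabilizing effect of stale measurements can be reproduced with a toy discrete-time loop. The model below is an assumption made for illustration, not a load frequency control model from the cited work: the state follows x[k+1] = x[k] - gain * x[k - delay], so the controller corrects the present state using a measurement that is `delay` steps old.

```python
def simulate(gain, delay, steps=60, x0=1.0):
    """Simulate x[k+1] = x[k] - gain * x[k - delay] and return the trajectory."""
    history = [x0] * (delay + 1)  # assume a constant state before control starts
    for _ in range(steps):
        history.append(history[-1] - gain * history[-1 - delay])
    return history


on_time = simulate(gain=0.8, delay=0)  # fresh measurements
delayed = simulate(gain=0.8, delay=2)  # same controller, two-step-old measurements

print(abs(on_time[-1]))
print(max(abs(x) for x in delayed[-10:]))
```

With fresh measurements the deviation decays geometrically. With a two-step delay, the same gain produces oscillations of growing amplitude, so a controller tuned for timely signals can be pushed into instability by the delay alone.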

Replay Attacks

Replay attacks, also known as Sybil attacks, take advantage of a false identity in the network. Two nodes of the communication network send each other specific signals to verify their identities. The replay attacker remains hidden in the communication network and eavesdrops on the communication channel until the identifying signal is exchanged. The attacker then uses the identifier signal obtained from one node to pose as a trusted node in the network. This situation is like someone stealing a social security number and using it to mislead credit card companies into issuing a credit card. To the best of our knowledge, this type of attack has mostly targeted vehicular networks, sensor networks, and social networks. Due to the interdependence of the smart grid, electric transportation systems, and wireless sensor networks (WSN), we study Sybil attacks as well.

In 2006, Piro et al. [72] analyzed replay attacks on ad hoc networks and suggested that mobility in the system can be used to enhance the system security rather than being a point of vulnerability. They showed that a Sybil attack can be detected even by a single node by having the system nodes passively monitor the traffic. Later, in 2008, Lv et al. [73] developed another detection method against Sybil attacks, in which the signal strength sensed by multiple sensors and their distance are utilized for detection. They showed that a Sybil attack has occurred when two different identities appear to have nearly the same position. Rabieh et al. [74] took another approach to the detection of Sybil attacks in vehicular ad hoc networks (VANET). In this approach, the attacker’s vehicle, known as the Sybil vehicle, claims to have multiple identities. The attacker may use these fake identities for various reasons such as faking the traffic flow. The Sybil attack is detected because the Sybil vehicle has fake locations and cannot respond to the challenge signal sent by the detection algorithm to the claimed location. Sharma et al. [75] proposed an alternative method for securing VANETs against Sybil attacks, which generates dynamic certificates to change the identifying signals dynamically, assuming that the adversary does not know the protocol for changing the certificates. Sarigiannidis et al. [76] proposed a rule-based detection system, known as RADS, to monitor and detect Sybil attacks on large-scale WSNs. This approach is based on the ultra-wideband ranging-based detection algorithm. RADS operates in a distributed manner and does not require sharing information between the nodes, which decreases the computational burden and expedites the detection process. More detection algorithms against Sybil attacks have been proposed in the literature [77,78,79,80].
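
The position-based criterion, that distinct identities appearing at nearly the same place are suspicious, can be sketched as follows. The sketch assumes position estimates are already available (for example from signal strength measurements), which is the hard part in practice; the node names, coordinates, and separation threshold are hypothetical, and this is not the algorithm of [73].

```python
from itertools import combinations
from math import dist


def suspected_sybil_pairs(positions, min_separation=1.0):
    """Return pairs of identities whose estimated positions nearly coincide."""
    return [
        (a, b)
        for (a, pa), (b, pb) in combinations(sorted(positions.items()), 2)
        if dist(pa, pb) < min_separation
    ]


# Hypothetical position estimates in meters.
positions = {
    "node_1": (0.0, 0.0),
    "node_2": (25.0, 4.0),
    "node_3": (25.3, 4.2),
    "node_4": (60.0, 18.0),
}
pairs = suspected_sybil_pairs(positions)
print(pairs)
```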

Indirect Cyber Attacks to Power Systems

The Internet of things is a system of interrelated computing devices, mechanical and digital machines, objects, animals, or people that are provided with unique identifiers and the ability to transfer data over a network without requiring human-to-human or human-to-computer interactions. The IoT has provided cyber attackers with the opportunity to tamper with the power grid through the internet. The IoT attacks on power systems follow two approaches: load altering attacks through the direct load control (DLC) programs, and targeting the data centers and computational loads [81••]. In load altering attacks, the attacker takes advantage of the dependency of demand side management programs on the internet and compromises the command signals to take over the operation of the residential and industrial loads that are supposed to be controlled by DLC programs. Alternatively, the attacker may hack into numerous vulnerable consumers’ devices, for example by injecting false electricity prices, in order to influence their load behavior [82]. The false command signal or the injected price signal would increase (or decrease) the individual loads of the consumers and abruptly change the aggregated load [81••]. Aside from the potential financial gains for the attacker and losses for the consumers, the abrupt changes of load may cause severe damage such as circuit overflow, voltage problems, tripping of transmission lines, damage to consumers’ equipment, or even a temporary shutdown of the power grid. Amini et al. [83, 84] proposed a dynamic load altering attack, in which the attacker is not only interested in the sudden spike of the aggregated load but also controls the timing of the spike. The main goal of the dynamic load altering attack is for the attacker to monitor the effect of the attack and shrewdly adjust it to achieve the maximum damage to the power grid and its operations. The authors also developed a detection algorithm against dynamic load altering attacks [85].

Alternatively, the attacker may target only a very select group of consumers and yet cause spikes in the aggregated load. The consumption of electricity in the IT sector, such as at Google and Microsoft data centers, is growing rapidly. The IT sector's demand for electricity is expected to increase from 2% to 5% of the total consumption in the USA over the next decade [86]. As a case in point, Microsoft’s data center in Quincy, WA, consumes 48 MW, which is the equivalent of 40,000 residential loads [81••]. The notion of cloud computing and selling computation power as a utility has expedited the growth of the IT sector and therefore its power consumption [87]. The computational load of a data center could change quickly and directly increase its power consumption. This elasticity of data centers’ loads and their direct dependency on the computational loads make data centers an attractive target for power systems attackers. Attackers may use the internet to increase the computational loads of data centers by requesting bogus computational tasks and thereby abruptly increase the load on the power grid.

Discussion and Conclusions

From the HRC perspective, the existing research on physical security of power systems has focused on the prevention and planning stage, while taking into account mitigation of damages and responses to potential attacks. The common shortcoming in this area is the assumption that the protected components will be completely secure and no longer at risk, which limits the application of these models in the real world. Future research needs to address this issue and provide a more reliable solution. The recovery stage of HRC for physical attacks has barely been studied in the literature. Power system recovery after natural disasters, a similar topic, has been well studied and could be used as a benchmark for studying power system recovery after physical attacks.

Our HRC analysis highlights a few concerns about the cyber security of power systems, which should be tackled by researchers in the future. First, the bulk of research on cyber security relates to the prevention mechanisms outlined in [18•] and to developing detection algorithms against the variety of cyber attacks discussed. The other two stages of the HRC perspective, response and recovery, have barely been studied in the literature. Response and recovery are the two major steps after cyber attack detection to mitigate risks and damages and restore the system to its normal operations. The fact that power systems are evolving into smart and autonomous grids puts more emphasis on the importance of response and recovery for the safe and secure operation of future power systems. Second, the research on cyber security of power systems has followed a micro-level approach that has caused gaps and disconnections among the stages of the holistic resiliency cycle. As a case in point, most of the proposed detection mechanisms are developed against a certain type of attack on a certain module under certain assumptions. This shortcoming limits the application of these models in the real world and could create more vulnerabilities in the system. It is critical to study cyber problems of power systems systematically, i.e., from prevention to recovery, in order to address the problem in its entirety and deny cyber attackers opportunities to exploit system vulnerabilities. Figure 1 summarizes our conclusions schematically.

Fig. 1 Holistic resiliency cycle