
1 Introduction

Modern power systems transmit electricity from generators to users via large-scale transmission and distribution networks. To ensure safe and reliable operation, information and communication technologies (ICT) are increasingly introduced into power systems to make them smarter [1]. However, the introduction of modern communication networks not only facilitates information exchange and wide-area monitoring, protection and control of power grids, but also makes the grids vulnerable to network intrusion [2, 3]. In recent years, cyber-attacks on power grids around the world have come to be viewed as a principal threat, not just a conceptual one.

For example, Iran’s Bushehr nuclear power plant was attacked by the Stuxnet virus in 2010, which delayed power generation and seriously damaged Iran’s industrial facilities [4]. The transmission lines in Ukraine were repeatedly tripped in 2015, while malicious software implanted in the information system blocked the system restart [5]. In 2019, several cities in Venezuela, including its capital city Caracas, plunged into darkness, and power outages affected 21 of the country’s 23 states. According to media reports, the direct cause of the power failure was a cyber-attack on the country’s largest hydropower station. Soon after, several transformer explosions occurred in the federal district of Caracas, causing another power failure [6].

The power system control center collects measurement data from different power devices and components through the supervisory control and data acquisition (SCADA) system and provides instructions back to the system [7]. State estimation is a key functionality in real-time power system monitoring and supervisory control. By analyzing the data collected by the SCADA system, the current operating state of the power grid can be estimated, while bad data and anomalies in the collected measurements can be eliminated.

However, state estimation can be vulnerable to cyber-attacks in an open network environment. The false data injection attack (FDIA) against the state estimator in the SCADA system was investigated by Liu et al. in 2009 [8], and it was found that existing bad data detection methods relying on the Chi-square detector may fail against some false data injection attacks. An experienced attacker can deliberately design the attack vector such that the attack bypasses the Chi-square detector. Once a sensor is successfully hacked, the tampered measurement spreads through the network, resulting in system performance degradation or even instability [9].

In research on false data injection attacks, some researchers aim to identify system vulnerabilities and build attack models [10,11,12,13], which improves the understanding of the attack mechanism and supports the design of better defense systems. For example, a linear spoofing attack strategy and the corresponding feasibility constraints are demonstrated in [10], where fake data can be effectively designed to cause system failure. In [11], the potential impact of unobservable attacks is investigated, and the least measurable attack strategy is proposed. Under the fully measurable model and the partially measurable model, the existence conditions of unobservable subspace attacks are derived, based on which two attack strategies are proposed in [12]. The first strategy directly affects the system state by hiding attack vectors in the system subspace, and the second strategy misleads the bad data detection mechanism. Meanwhile, other researchers focus on the detection and defense of the system in the presence of attacks [14,15,16,17,18,19]. For example, both active detection and estimation-based detection are proposed in [14]. In the active detection method, a suitable excitation signal is designed and superimposed on the control signals, which improves the detectability of attacks on the actuator. The other method estimates the value of the attack by using an unknown input observer. In [15], an FDIA detection mechanism based on the increments of analytic measurements in the micro-grid environment was proposed.

Most existing research is based on the analysis of the acquired measurements, but the impact of data communication is not considered. The FDIA in smart grid applications is an attack that reduces the integrity of the data acquired by the system. In existing communication technology, data transmitted through the network is often in the form of packets [20]. Most existing approaches construct the attack model and detect the attack using the acquired measurements and the estimates of the measurements [10,11,12,13,14,15,16,17,18]. However, in addition to potential FDIA, the data transmitted through the network may also be affected by network characteristics such as data losses during the transmission phase. This paper investigates the data injection attack on power system state estimation considering data losses in communication. The main contributions are as follows:

  • A DC (direct current) model of the system under data injection attack is deduced, taking into account the random packet losses.

  • The mechanisms of weighted least squares state estimation and bad data detection are analyzed, and an undetectable range of attack vectors is derived.

  • Based on the established DC measurement model, the mean square error matrix of state estimation under the FDIA is analyzed.

The remainder of the paper is organized as follows. The transmission model of sensor measurements in the power grids under random packet loss is discussed in Sect. 2. Section 3 analyses the effects of random packet loss and data injection attacks on weighted least squares estimation, and the range of attack vectors is also studied. Simulation results are presented in Sect. 4, and the weighted least squares state estimation results under three different cases are compared.

2 Problem Formulation

2.1 Data Transmission Model

The SCADA system in the power grids collects sampled measurements from sensors through the communication network. However, due to limitations of the communication technology, data may get lost during the transmission. Figure 1 illustrates the whole process from data sampling and transmission to state estimation.

Fig. 1. The data sampling, transmission and state estimation process.

As shown in Fig. 1, at time instant \({{t}_{k-\text {1}}}\), the measurement device samples the sensor measurements and transmits them to the network in the form of packets. Due to network-induced delays, the SCADA system receives the sampled measurements after the transmission delay \({{d}_{k-1}}\), i.e., at time instant \({{t}_{k-\text {1}}}+{{d}_{k-1}}\). Further, some data may be lost during the transmission process, such as the data at time instant \({{t}_{k}}\) shown in Fig. 1. Once the SCADA system obtains the measurements, the estimator receives the data after the computing time delay \({{c}_{k-1}}\). Power grids are typical complex cyber-physical systems with numerous sensors, and all sensor data go through a process similar to that shown in Fig. 1 when they are transmitted to the SCADA system.

Define the measurements received by the SCADA system at sampling instant k as \({{z}_{k}}\), \({{z}_{k}}\in {{R}^{m}}\). If data packets are lost, two popular compensation methods are often adopted. One is to directly replace the lost data with 0 [21]. The other is to replace the lost data with the previously sampled data. This paper adopts the first method, i.e., the lost packet is set to 0. For random packet losses, the received measurements can be expressed by

$$\begin{aligned} {{z}_{lk}}={{\lambda }_{k}}{{z}_{k}}, \end{aligned}$$
(1)

where \({{\lambda }_{k}}\in {{R}^{m\times m}}\) is a diagonal matrix whose diagonal elements are either 1 or 0. When a measurement is lost, its corresponding value is set to 0.
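The zero-fill compensation of Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `apply_packet_loss`, the example values, and the assumption that each measurement is lost independently with the same probability `loss_rate` are all introduced here for illustration only.

```python
import numpy as np

def apply_packet_loss(z, loss_rate, rng):
    """Return (z_l, lam): the zero-filled measurements and the 0/1 diagonal matrix of Eq. (1)."""
    m = z.shape[0]
    # received[i] is True when measurement i arrives and False when its packet is lost.
    # Assumption: losses are independent Bernoulli events with probability loss_rate.
    received = rng.random(m) >= loss_rate
    lam = np.diag(received.astype(float))
    return lam @ z, lam

rng = np.random.default_rng(0)
z_k = np.array([1.2, -0.4, 0.7, 0.3])   # hypothetical measurement vector
z_lk, lam_k = apply_packet_loss(z_k, loss_rate=0.25, rng=rng)
```

Because the diagonal entries of `lam_k` are 0 or 1, the matrix is idempotent, a property that is used later when the estimator is analyzed.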

2.2 Power Grid Measurement Model

When the system is subject to a false data injection attack, the measurement process of the grid is shown in Fig. 2. When a sensor device samples measurements, it may be compromised by an attacker through deception, so that false data are injected. Next, the sensor transmits the corrupted data to the SCADA system over the network. When random packet loss is not considered, the AC measurement model at the sampling instant k can be described as

$$\begin{aligned} {{z}_{k}}=h({{x}_{k}})+{{v}_{k}}, \end{aligned}$$
(2)

where \({{z}_{k}}\) is denoted as the measurement vector, \({{x}_{k}}\) is the system state vector, \({{v}_{k}}\) is the Gaussian measurement noise, and \(h({{x}_{k}})\) is the functional dependency between measurements and state variables.

Fig. 2. The power grid measurement process subject to false data injection attack.

If the ground admittance and branch conductance are ignored, and it is assumed that the voltage phase difference between two connected nodes is small and the voltage amplitude of each node is close to the unit quantity 1, the DC measurement model can be used to approximate the AC measurement model. The DC measurement model can be expressed as

$$\begin{aligned} {{z}_{k}}={{H}}{{x}_{k}}+{{v}_{k}}, \end{aligned}$$
(3)

where H is the constant measurement matrix describing the steady-state linear dependency between measurements and state variables.

When only random packet loss is considered, at the sampling instant k, the DC measurement model can be expressed as

$$\begin{aligned} {{z}_{lk}}={{\lambda }_{k}}(H{{x}_{k}}+{{v}_{k}}). \end{aligned}$$
(4)

When only a data injection attack is considered, assume that the injected value is \({{a}_{k}}\) with \({{a}_{k}}\in {{R}^{m}}\). If an entry of \({{a}_{k}}\) is nonzero, the corresponding measurement is tampered with. The measurement then contains the attack vector \({{a}_{k}}\) and can be expressed as

$$\begin{aligned} {{z}_{ak}}={{z}_{k}}+{{a}_{k}}, \end{aligned}$$
(5)

where \({{a}_{k}}\) is the attack vector injected to measurement.

When random packet loss is also considered, the measurement function can be expressed as

$$\begin{aligned} {{z}_{lak}}={{\lambda }_{k}}{{z}_{ak}}. \end{aligned}$$
(6)

Equation (6) is the measurement model under the false data injection attack which considers both the influence of random packet loss and data injection attack on the measurements of the grid.
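The four measurement variants in Eqs. (3) to (6) differ only in whether an attack vector and a loss matrix are applied, which a single helper can illustrate. The sketch below is an assumption-laden toy: the function `measure`, the 3-by-2 matrix `H`, and all numerical values are hypothetical and are not taken from the paper.

```python
import numpy as np

def measure(H, x, v, a=None, lam=None):
    """DC measurement with optional attack a (Eq. 5) and packet-loss matrix lam (Eqs. 4 and 6)."""
    m = H.shape[0]
    a = np.zeros(m) if a is None else a      # no attack: a_k = 0
    lam = np.eye(m) if lam is None else lam  # no loss: lambda_k = I
    return lam @ (H @ x + v + a)

# Hypothetical toy system: 3 measurements, 2 states, noise set to zero for readability.
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]])
x = np.array([0.1, -0.2])
v = np.zeros(3)
a = np.array([0.0, 0.5, 0.0])        # second measurement is tampered with
lam = np.diag([1.0, 1.0, 0.0])       # third measurement is lost

z_k = measure(H, x, v)               # Eq. (3)
z_lk = measure(H, x, v, lam=lam)     # Eq. (4)
z_ak = measure(H, x, v, a=a)         # Eq. (5)
z_lak = measure(H, x, v, a=a, lam=lam)  # Eq. (6)
```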

3 Analysis of Weighted Least Squares Estimation

State estimation is used to monitor the operating state of the grid and to remove bad data, and the weighted least squares method is a popular state estimation method. Since the false data injection attack aims to mislead the state estimation, a detailed analysis of the state estimator is necessary. According to the weighted least squares estimation, the objective function can be expressed as

$$\begin{aligned} \min J\left( {{x}_{k}} \right) ={{({{z}_{k}}-H{{x}_{k}})}^{T}}W({{z}_{k}}-H{{x}_{k}}), \end{aligned}$$
(7)

where W is the weighting matrix. The estimate of the system state can be expressed as

$$\begin{aligned} {{\hat{x}}_{k}}={{({{H}^{T}}W{{H}})}^{-1}}{{H}^{T}}W{{z}_{k}}. \end{aligned}$$
(8)

Define \({{\hat{z}}_{k}}=H{{\hat{x}}_{k}}\) as the estimate of the measurements. The residual between the actual measurements and their estimate is defined as \({{r}_{k}}\), which can be expressed as

$$\begin{aligned} {{r}_{k}}={{z}_{k}}-{{\hat{z}}_{k}}. \end{aligned}$$
(9)
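As a minimal sketch of the estimator and residual defined above, the following Python code solves the normal equations with the gain matrix \({{H}^{T}}WH\) (the form that also appears in Eq. (12)). The function names and the toy system are hypothetical; the weighting \(W={{R}_{v}}^{-1}\) with a noise standard deviation of 0.02 is an assumption borrowed from the simulation section.

```python
import numpy as np

def wls_estimate(H, W, z):
    """Weighted least squares state estimate: solves (H^T W H) x = H^T W z."""
    # A linear solve is used instead of forming the inverse explicitly.
    return np.linalg.solve(H.T @ W @ H, H.T @ W @ z)

def residual(H, W, z):
    """Measurement residual r = z - H x_hat, as in Eq. (9)."""
    return z - H @ wls_estimate(H, W, z)

# Hypothetical toy system: 3 measurements, 2 states.
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]])
W = np.eye(3) / 0.02**2           # assumed weighting: inverse noise variance
x_true = np.array([0.1, -0.2])
z = H @ x_true                    # noise-free measurements for illustration

x_hat = wls_estimate(H, W, z)
r = residual(H, W, z)
```

With noise-free measurements the estimate reproduces the true state and the residual vanishes; with noisy or tampered measurements the residual norm is what the detector in Eq. (10) compares with its threshold.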

According to the Chi-square detector, the 2-norm of the residual must be less than or equal to the threshold for the measurements to be considered free of bad data, i.e.,

$$\begin{aligned} {{\left\| {{r}_{k}} \right\| }_{2}}\le \tau , \end{aligned}$$
(10)

where \(\tau \) is the threshold of the Chi-square detector, which can be obtained by checking the Chi-square distribution table. When there is only a false data injection attack, the injected increment must meet certain conditions in order not to be detected. According to (8), for a given \({{a}_{k}}\), the state estimation can be expressed as

$$\begin{aligned} {{\hat{x}}_{ak}}={{({{H}^{T}}W{{H}})}^{-1}}{{H}^{T}}W{{z}_{ak}}, \end{aligned}$$
(11)

where \({{\hat{x}}_{ak}}\) is the corrupted estimate due to the FDIA. The estimate of the measurement is \({{\hat{z}}_{ak}}=H{{\hat{x}}_{ak}}\), and the residual can be expressed as

$$\begin{aligned} \begin{array}{l} {{r}_{ak}}={{z}_{ak}}-{{{\hat{z}}}_{ak}}={{z}_{k}}+{{a}_{k}}-(H{{{\hat{x}}}_{k}}+H{{\left( {{H}^{T}}WH \right) }^{-1}}{{H}^{T}}W{{a}_{k}}) \\ =(I-H{{\left( {{H}^{T}}WH \right) }^{-1}}{{H}^{T}}W)({{z}_{k}}+{{a}_{k}}). \\ \end{array} \end{aligned}$$
(12)

To evade the detector, Eq. (13) must be satisfied, that is

$$\begin{aligned} {{\left\| {{r}_{ak}} \right\| }_{2}}\le \tau . \end{aligned}$$
(13)

Letting \(B=(I-H{{\left( {{H}^{T}}WH \right) }^{-1}}{{H}^{T}}W)\), Eq. (13) can be rewritten as

$$\begin{aligned} {{\left\| B({{z}_{k}}+{{a}_{k}}) \right\| }_{2}}\le \tau . \end{aligned}$$
(14)

According to the compatibility of the induced matrix norm with the vector norm,

$$\begin{aligned} {{\left\| B({{z}_{k}}+{{a}_{k}}) \right\| }_{2}}\le {{\left\| B \right\| }_{2}}{{\left\| ({{z}_{k}}+{{a}_{k}}) \right\| }_{2}}, \end{aligned}$$
(15)

when \({{\left\| B \right\| }_{2}}{{\left\| ({{z}_{k}}+{{a}_{k}}) \right\| }_{2}}\le \tau \) holds, Eq. (14) also holds, where \({{\left\| B \right\| }_{2}}=\sqrt{{{\eta }_{\max }}({{B}^{T}}B)}\) is the induced norm and \({{\eta }_{\max }}({{B}^{T}}B)\) is the maximum eigenvalue of the matrix \({{B}^{T}}B\).

Therefore,

$$\begin{aligned} {{\left\| ({{z}_{k}}+{{a}_{k}}) \right\| }_{2}}\le \frac{\tau }{{{\left\| B \right\| }_{2}}}. \end{aligned}$$
(16)

Remark 1

Inequality (16) describes a subset of attack vectors that will not trigger an alarm from the bad data detector. It is a sufficient condition, so attack vectors outside this subset may also evade detection.

Corollary 1

Inequality (17) gives a non-detectable spoofing range of the attack vector.

$$\begin{aligned} {{\left\| {{a}_{k}} \right\| }_{2}}\le \frac{\tau }{{{\left\| B \right\| }_{2}}}-{{\left\| {{z}_{k}} \right\| }_{2}}. \end{aligned}$$
(17)

Corollary 1 follows directly from the triangle inequality. The specific derivation is given as follows.

According to the triangle inequality of the vector 2-norm,

$$\begin{aligned} {{\left\| ({{z}_{k}}+{{a}_{k}}) \right\| }_{2}}\le {{\left\| {{z}_{k}} \right\| }_{2}}+{{\left\| {{a}_{k}} \right\| }_{2}}, \end{aligned}$$
(18)

when \({{\left\| {{z}_{k}} \right\| }_{2}}+{{\left\| {{a}_{k}} \right\| }_{2}}\le \frac{\tau }{{{\left\| B \right\| }_{2}}}\) holds, Eq. (16) also holds. Therefore, Eq. (17) is a safe range of the attack vector.
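The bound of Corollary 1 can be checked numerically. The sketch below computes \(B\), its induced 2-norm, and the right-hand side of Eq. (17), then verifies that an attack vector inside the range keeps the residual norm below the threshold. The function names, the toy system, and the value of `tau` are hypothetical; in practice the threshold would come from the Chi-square distribution table as stated above.

```python
import numpy as np

def residual_matrix(H, W):
    """B = I - H (H^T W H)^{-1} H^T W, so that the residual is r = B z."""
    m = H.shape[0]
    return np.eye(m) - H @ np.linalg.solve(H.T @ W @ H, H.T @ W)

def attack_norm_bound(H, W, z, tau):
    """Right-hand side of Eq. (17): tau / ||B||_2 - ||z||_2."""
    B = residual_matrix(H, W)
    # For a 2-D array, ord=2 gives the largest singular value (the induced 2-norm).
    return tau / np.linalg.norm(B, 2) - np.linalg.norm(z)

# Hypothetical toy system and threshold.
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]])
W = np.eye(3) / 0.02**2
z = H @ np.array([0.1, -0.2])
tau = 1.0

bound = attack_norm_bound(H, W, z, tau)
a = np.array([0.0, 0.9 * bound, 0.0])   # attack with ||a||_2 inside the range
B = residual_matrix(H, W)
evades = np.linalg.norm(B @ (z + a)) <= tau
```

Note that the bound is only useful when it is positive. If \({{\left\| {{z}_{k}} \right\| }_{2}}\) is large relative to \(\tau /{{\left\| B \right\| }_{2}}\), the range in Eq. (17) is empty even though undetected attacks may still exist, because Eqs. (15) and (18) are both conservative.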

When packets are randomly lost, the integrity of the data collected by SCADA is destroyed. However, due to the redundancy of data in the data acquisition of power grids, the effect of losing a small number of measurements may be small. To study the effect of data injection attacks on the performance of state estimation under random packet losses, the mean square error (MSE) of weighted least squares state estimation under random packet losses is derived.

Suppose that the state vector \({{x}_{k}}\), the attack vector \({{a}_{k}}\), and the noise \({{v}_{k}}\) obey zero-mean Gaussian distributions with covariance matrices \({{R}_{{{x}_{k}}}}\), \({{R}_{{{a}_{k}}}}\), and \({{R}_{v}}\), respectively. When there is random packet loss, the measurement model of the system is given by Eq. (6). Combining Eq. (11) with Eq. (6), the state estimate of the system can be expressed as

$$\begin{aligned} \begin{array}{l} {{{\hat{x}}}_{lak}}={{({{({{\lambda }_{k}}H)}^{T}}W{{\lambda }_{k}}H)}^{-1}}{{({{\lambda }_{k}}H)}^{T}}W({{z}_{k}}+{{a}_{k}})\\ ={{({{H}^{T}}{{\lambda }_{k}}W{{H}})}^{-1}}{{H}^{T}}{{\lambda }_{k}}W({{z}_{k}}+{{a}_{k}}). \end{array} \end{aligned}$$
(19)

When the system state estimation residual is defined as \({{\varepsilon }_{{{x}_{k}}}}={{\hat{x}}_{lak}}-{{x}_{k}}\), \({{\varepsilon }_{{{x}_{k}}}}\) can be expressed as

$$\begin{aligned} \begin{array}{l} {{\varepsilon }_{{{x}_{k}}}}={{({{H}^{T}}{{\lambda }_{k}}WH)}^{-1}}{{H}^{T}}{{\lambda }_{k}}W({{z}_{k}}+{{a}_{k}})-{{x}_{k}} \\ ={{({{H}^{T}}{{\lambda }_{k}}WH)}^{-1}}{{H}^{T}}{{\lambda }_{k}}W(H{{x}_{k}}+{{v}_{k}}+{{a}_{k}})-{{x}_{k}} \\ ={{({{H}^{T}}{{\lambda }_{k}}WH)}^{-1}}{{H}^{T}}{{\lambda }_{k}}W({{v}_{k}}+{{a}_{k}}) \\ \end{array} \end{aligned}$$
(20)

When there are random packet losses and a data injection attack, the mean square error matrix of the system state estimation is

$$\begin{aligned} \begin{array}{l} {{R}_{{{\varepsilon }_{{{x}_{k}}}}}}=E\{{{\varepsilon }_{{{x}_{k}}}}\varepsilon _{{{x}_{k}}}^{T}\}={{({{H}^{T}}{{\lambda }_{k}}WH)}^{-1}} \\ +{{({{H}^{T}}{{\lambda }_{k}}WH)}^{-1}}{{H}^{T}}{{\lambda }_{k}}W{{R}_{{{a}_{k}}}}{{\lambda }_{k}}WH{{({{H}^{T}}{{\lambda }_{k}}WH)}^{-1}}. \end{array} \end{aligned}$$
(21)
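The step from Eq. (20) to Eq. (21) can be filled in as follows. This is a sketch under assumptions that the text does not state explicitly: \({{v}_{k}}\) and \({{a}_{k}}\) are zero-mean and mutually independent, \({{\lambda }_{k}}\) is treated as a given realization, \(W={{R}_{v}}^{-1}\), and both \(W\) and \({{\lambda }_{k}}\) are diagonal, so that they commute and \({{\lambda }_{k}}{{\lambda }_{k}}={{\lambda }_{k}}\). The shorthand \({{G}_{k}}\) is introduced here only for illustration.

$$\begin{aligned} G_{k}&=(H^{T}\lambda _{k}WH)^{-1}H^{T}\lambda _{k}W,\qquad \varepsilon _{x_{k}}=G_{k}(v_{k}+a_{k}), \\ R_{\varepsilon _{x_{k}}}&=E\{G_{k}(v_{k}+a_{k})(v_{k}+a_{k})^{T}G_{k}^{T}\}=G_{k}R_{v}G_{k}^{T}+G_{k}R_{a_{k}}G_{k}^{T}, \\ G_{k}R_{v}G_{k}^{T}&=(H^{T}\lambda _{k}WH)^{-1}H^{T}\lambda _{k}WR_{v}W\lambda _{k}H(H^{T}\lambda _{k}WH)^{-1} \\ &=(H^{T}\lambda _{k}WH)^{-1}H^{T}\lambda _{k}WH(H^{T}\lambda _{k}WH)^{-1}=(H^{T}\lambda _{k}WH)^{-1}, \\ G_{k}R_{a_{k}}G_{k}^{T}&=(H^{T}\lambda _{k}WH)^{-1}H^{T}\lambda _{k}WR_{a_{k}}\lambda _{k}WH(H^{T}\lambda _{k}WH)^{-1}. \end{aligned}$$

The third line uses \(W{{R}_{v}}W=W\) and \({{\lambda }_{k}}W{{\lambda }_{k}}={{\lambda }_{k}}W\); summing the last two results gives Eq. (21).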

Letting \({{B}_{k}}=({{H}^{T}}{{\lambda }_{k}}WH)\), \({{R}_{{{\varepsilon }_{{{x}_{k}}}}}}\) can be expressed as

$$\begin{aligned} {{R}_{{{\varepsilon }_{{{x}_{k}}}}}}={{B}_{k}}^{-1}+{{B}_{k}}^{-1}{{H}^{T}}{{\lambda }_{k}}W{{R}_{{{a}_{k}}}}{{\lambda }_{k}}WH{{B}_{k}}^{-1}. \end{aligned}$$
(22)

Ideally, when there are no packet losses and no data injection attacks, \({{\lambda }_{k}}=I\) and \({{a}_{k}}=0\). Then the mean square error matrix of the weighted least squares state estimation is

$$\begin{aligned} {{R}_{{{\varepsilon }_{{{x}_{k}}}}}}={{({{H}^{T}}WH)}^{-1}}. \end{aligned}$$
(23)

Comparing Eqs. (22) and (23), it can be seen that random packet losses not only affect the state estimation itself but also change the effect of the data injection attack.
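The comparison between Eqs. (22) and (23) can be made concrete with a small numerical sketch. The function `mse_matrix` and the toy system are hypothetical, and the code assumes diagonal \(W\) and \({{\lambda }_{k}}\), as the derivation does.

```python
import numpy as np

def mse_matrix(H, W, lam, R_a):
    """Mean square error matrix of Eq. (22) for a given loss matrix lam and attack covariance R_a."""
    Bk_inv = np.linalg.inv(H.T @ lam @ W @ H)   # B_k^{-1}
    G = Bk_inv @ H.T @ lam @ W                  # gain applied to (v_k + a_k)
    # For diagonal (hence symmetric, commuting) W and lam, G @ R_a @ G.T equals
    # the second term of Eq. (22).
    return Bk_inv + G @ R_a @ G.T

# Hypothetical toy system: 3 measurements, 2 states.
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]])
W = np.eye(3) / 0.02**2
no_attack = np.zeros((3, 3))

R_ideal = mse_matrix(H, W, np.eye(3), no_attack)                  # reduces to Eq. (23)
R_loss = mse_matrix(H, W, np.diag([1.0, 1.0, 0.0]), no_attack)    # third measurement lost
R_both = mse_matrix(H, W, np.diag([1.0, 1.0, 0.0]), 0.01 * np.eye(3))
```

If too many packets are lost, \({{H}^{T}}{{\lambda }_{k}}WH\) becomes singular and the inverse no longer exists, which corresponds to the system losing observability; the sketch does not handle that case.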

4 Simulation Study

To assess the impact of data injection attacks under random packet losses on smart grid state estimation, the IEEE-14 node system is used in the simulation experiments, as shown in Fig. 3. The IEEE-14 node system has 54 measurements, where 1–14 are the measurements of the active power of the buses, 15–34 are the measurements of branch power at the incoming nodes, and 35–54 are the measurements of branch power at the outgoing nodes. The noise of each measurement is assumed to obey the Gaussian distribution, i.e., \({{v}_{i}}\sim N(0,{{0.02}^{2}})\), where \(i=1,2,\cdots ,54\). Since the phase angle of the reference bus is \({{\delta }_{1}}=0\), only the states of the other 13 nodes need to be estimated, and \(H\in {{R}^{54\times 13}}\).

Fig. 3. The power grids measurement process.

Firstly, node 1 is selected as the reference node, and the true values of the states and measurements are obtained from 100 power flow calculations. It is assumed that the white noise obeys the Gaussian distribution \(N(0,{{0.02}^{2}})\) and that the measurement error covariance matrix is constant.

Performance Index: From Eq. (23) under ideal conditions, when there is no data injection attack and transmission packet losses, the mean square error matrix of system state estimation is \({{R}_{{{\varepsilon }_{{{x}_{k}}}}}}={{({{H}^{T}}WH)}^{-1}}\). In order to measure the state estimation performance, Eq. (24) is used as the performance index.

$$\begin{aligned} \text {Performance}=\frac{{{\left\| (xreal-\hat{x})(xreal-\hat{x})' \right\| }_{F}}}{{{\left\| {{({{H}^{T}}WH)}^{-1}} \right\| }_{F}}}, \end{aligned}$$
(24)

where xreal is the true value of the system state, \(\hat{x}\) is its estimate, and \({{\left\| \cdot \right\| }_{F}}\) is the Frobenius norm of a matrix.
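The performance index of Eq. (24) can be computed as in the following sketch. The function name `performance_index` and the toy values are hypothetical; the code simply normalizes the Frobenius norm of the error outer product by that of the ideal mean square error matrix.

```python
import numpy as np

def performance_index(x_real, x_hat, H, W):
    """Eq. (24): ||e e^T||_F / ||(H^T W H)^{-1}||_F with e = x_real - x_hat."""
    e = (x_real - x_hat).reshape(-1, 1)          # column vector of estimation errors
    num = np.linalg.norm(e @ e.T, 'fro')
    den = np.linalg.norm(np.linalg.inv(H.T @ W @ H), 'fro')
    return num / den

# Hypothetical toy system: 3 measurements, 2 states.
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]])
W = np.eye(3) / 0.02**2
x_real = np.array([0.1, -0.2])

perfect = performance_index(x_real, x_real, H, W)
biased = performance_index(x_real, x_real + np.array([0.01, 0.0]), H, W)
```

Since the Frobenius norm of an outer product \(e{{e}^{T}}\) equals \(\left\| e \right\| _{2}^{2}\), the index grows quadratically with the estimation error; a value near 1 means the squared error is comparable to the ideal mean square error level.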

When only data injection attacks exist and Eq. (17) is satisfied, three different attacks are randomly selected, where one measurement is tampered with in \({{a}_{88}}\), five measurements are tampered with in \({{a}_{41}}\), and ten measurements are tampered with in \({{a}_{55}}\). The value of each non-zero entry of the attack vector is randomly selected in \([-(\frac{\tau }{{{\left\| B \right\| }_{2}}}-{{\left\| {{z}_{k}} \right\| }_{2}})/p,(\frac{\tau }{{{\left\| B \right\| }_{2}}}-{{\left\| {{z}_{k}} \right\| }_{2}})/p]\), where p is the number of tampered devices. The details of the attack vector are shown in Table 1. The estimated results are also illustrated in Fig. 4.
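The attack construction described above can be sketched as follows. The function `sample_attack`, the uniform distribution, the random choice of tampered positions, and the numerical value of `bound` are assumptions made for illustration; the text only states that the entries are randomly selected in the given interval.

```python
import numpy as np

def sample_attack(m, p, bound, rng):
    """Draw an attack vector with p tampered entries, each in [-bound/p, bound/p]."""
    a = np.zeros(m)
    idx = rng.choice(m, size=p, replace=False)        # which measurements are tampered with
    a[idx] = rng.uniform(-bound / p, bound / p, size=p)
    return a

rng = np.random.default_rng(1)
bound = 0.6     # hypothetical value of tau / ||B||_2 - ||z_k||_2
attacks = {p: sample_attack(54, p, bound, rng) for p in (1, 5, 10)}
norms = {p: np.linalg.norm(a) for p, a in attacks.items()}
```

With this construction \({{\left\| {{a}_{k}} \right\| }_{2}}\le \sqrt{p}\cdot \text{bound}/p=\text{bound}/\sqrt{p}\), so Eq. (17) is always satisfied, and the admissible amplitude per entry shrinks as the number of tampered measurements grows.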

Table 1. Details of the attack vector in data injection attack only

Fig. 4. State estimation under only data injection attack.

According to Fig. 4, the data injection attack has a great impact on system state estimation. However, as the number of attacked dimensions increases, the impact of the attack on the estimation decreases gradually, provided that the attack vector remains non-detectable by satisfying Eq. (16).

Fig. 5. Estimation performance under only data injection attack.

The performance index is also illustrated in Fig. 5. The decreasing impact results from the attack vector being limited by Eq. (17): the more dimensions are attacked, the lower the amplitude of each non-zero entry of the attack vector becomes.

In the packet loss only scenario, three packet loss rates are randomly selected, which are 2%, 5% and 10% respectively. The specific information of random packet losses is shown in Table 2, and the estimation results are illustrated in Fig. 6.

Table 2. Details of the packet loss due to random packet loss only

As shown in Figs. 6 and 7, a small amount of random data packet loss in the data transmission of the sensors does not have a significant impact on the system state estimation. This is due to the measurement redundancy of the power system, which guarantees the safety and reliability of power system state estimation.

Fig. 6. State estimation under random packet loss scenario.

Fig. 7. Estimation performance in random packet loss only scenario.

Furthermore, comparing Fig. 4 with Fig. 6, it is clear that the data injection attack has a greater impact on the system state estimation. Again, the performance indexes as shown in Figs. 5 and 7 are not in the same order of magnitude.

When both packet losses and data injection attacks are present, a 5% packet loss rate and an attack on 5 measurements are simulated.

Table 3. The details of the packet loss and attack vectors

Fig. 8. State estimation under both random packet loss and data injection attack.

Three scenarios are analyzed, in which the lost packets and the attacked measurements do not coincide, partially coincide, and fully coincide. The specific information on the random packet losses and attack vectors is listed in Table 3, and the estimation results are illustrated in Fig. 8. It can be seen that the combination of data injection attack and random packet loss has a great impact on the system state estimation results.

Fig. 9. Estimation performance under both random packet loss and data injection attack.

As shown in Fig. 9, when the random packet losses coincide with the attacked measurements, the estimation performance is better than in the non-overlapping case, although the attack vector itself still has the greater impact.

5 Conclusions

This paper has analyzed the impact of false data injection attacks on smart grid state estimation under random packet losses. Firstly, the measurement model of the power grid under random packet losses and data injection attacks is established. Then, the weighted least squares estimation is analyzed, and a non-detectable range of attack vectors in the data injection attack is derived. It is shown that as long as the attack vectors are selected in the derived range, the existing “bad data” detector will not respond. Further, considering the false data injection attack, the mean square error matrix of the weighted least squares estimation is provided. Finally, simulation experiments on the IEEE-14 node system are used to compare the effects of data injection attack, random packet loss, and simultaneous random packet loss and data injection attack on the system state estimation.