3.1 Introduction

Security issues such as the presence of malicious attacks can cause severe consequences in cyber-physical systems, which are safety-critical in most cases since they interact with the physical world. As cyber-physical systems become more and more prevalent nowadays, it is also increasingly critical to be fully aware of such systems’ performance limits (Fang et al. 2017), e.g., in terms of performance degradation, after taking the security issues into consideration. Accordingly, in this chapter, we focus on analyzing the fundamental limits of resilience in cyber-physical systems, including open-loop dynamical systems and (closed-loop) feedback control systems. More specifically, we examine the fundamental trade-offs between the performance degradation that can be brought about by a malicious attack and the possibility of the attack being detected, of which the former is oftentimes measured by the mean squared-error distortion, whereas the latter is fundamentally determined by the Kullback–Leibler (KL) divergence.

The KL divergence was proposed in Kullback and Leibler (1951) (see also Kullback (1997)), and ever since it has been employed in various research areas, including, e.g., information theory (Cover and Thomas 2006), signal processing (Kay 2020), statistics (Pardo 2006), control and estimation theory (Lindquist and Picci 2015), system identification (Stoorvogel and Van Schuppen 1996), and machine learning (Goodfellow et al. 2016). Particularly, in statistical detection theory (Poor 2013), the KL divergence provides the optimal error exponent for binary hypothesis testing problems as a result of the Chernoff–Stein lemma (Cover and Thomas 2006). Accordingly, in the context of determining whether an attack signal is present or not in security problems, the KL divergence has also been employed as a measure of stealthiness for attacks (see detailed discussions in, e.g., Bai et al. (2017a, b)).

In the context of dynamical and control system security (see, e.g., Poovendran et al. (2012), Johansson et al. (2014), Sandberg et al. (2015), Cheng et al. (2017), Giraldo et al. (2018), Weerakkody et al. (2019), Dibaji et al. (2019), Chong et al. (2019) and the references therein), particularly in dynamical and control systems under injection attacks, fundamental stealthiness–distortion trade-offs (with the mean squared-error as the distortion measure and the KL divergence as the stealthiness measure) have been investigated for feedback control systems (see, e.g., Zhang and Venkitasubramaniam (2017), Bai et al. (2017b)) as well as state estimation systems (see, e.g., Bai et al. (2017a), Kung et al. (2016), Guo et al. (2018)). Generally speaking, the problem considered is: Given a constraint (upper bound) on the level of stealthiness, what is the maximum degree of distortion (for control or for estimation) that can be caused by the attacker? This is dual to the following question: Given a least requirement (lower bound) on the degree of distortion, what is the maximum level of stealthiness that can be achieved by the attacker? Answers to these questions can not only capture the fundamental trade-offs between stealthiness and distortion but also characterize what the worst-case attacks are.

In this chapter, unlike the aforementioned works in Bai et al. (2017a, b), Kung et al. (2016), Zhang and Venkitasubramaniam (2017), Guo et al. (2018), we adopt an alternative approach to this stealthiness–distortion trade-off problem using power spectral analysis. The scenarios we consider include linear Gaussian open-loop dynamical systems and (closed-loop) feedback control systems. By using the power spectral approach, we obtain explicit formulas that characterize analytically the stealthiness–distortion trade-offs as well as the properties of the worst-case attacks. It turns out that the worst-case attacks are stationary colored Gaussian attacks with power spectra that are shaped specifically according to the transfer functions of the systems and the power spectra of the system outputs, the knowledge of which is all that the attacker needs to have access to in order to carry out the worst-case attacks. In other words, the attacker only needs to know the input–output behaviors of the systems, whereas it is not necessary to know their state-space models.

The remainder of the chapter is organized as follows. Section 3.2 provides the technical preliminaries. Section 3.3 is divided into two subsections, focusing on open-loop dynamical systems and feedback control systems, respectively. Section 3.4 presents numerical examples. Concluding remarks are given in Sect. 3.5.

More specifically, Theorem 3.1, as the first main result, characterizes explicitly the stealthiness–distortion trade-off and the worst-case attack in linear Gaussian open-loop dynamical systems. Correspondingly, Corollary 3.1 considers the dual problem to that of Theorem 3.1. On the other hand, Theorem 3.2, together with Corollary 3.2 (in a dual manner), provides analytical expressions for the stealthiness–distortion trade-off and the worst-case attack in linear Gaussian feedback control systems. In addition, the preliminary results on the implications in control design, as presented in the Conclusion, indicate how the explicit stealthiness–distortion trade-off formula for feedback control systems can be employed to render the controller design explicit and intuitive.

Note that this chapter is based upon (Fang and Zhu 2021), which, however, only discusses the case of open-loop dynamical systems. Meanwhile, in this chapter, we also consider (closed-loop) feedback control systems. Note also that the results presented in this book chapter are applicable to discrete-time systems.

Notation: Throughout the chapter, we consider zero-mean real-valued continuous random variables and random vectors, as well as discrete-time stochastic processes. We represent random variables and random vectors using boldface letters, e.g., \(\mathbf {x}\), while the probability density function of \(\mathbf {x}\) is denoted as \(p_\mathbf {x}\). In addition, \(\mathbf {x}_{0,\ldots ,k}\) will be employed to denote the sequence \(\mathbf {x}_{0}, \ldots , \mathbf {x}_{k}\) or the random vector \(\left[ \mathbf {x}_0^{\mathrm {T}},\ldots ,\mathbf {x}_{k}^{\mathrm {T}} \right] ^{\mathrm {T}}\), depending on the context. Note in particular that, for simplicity and with abuse of notation, we utilize \(\mathbf {x} \in \mathbb {R}\) and \(\mathbf {x} \in \mathbb {R}^m\) to indicate that \(\mathbf {x}\) is a real-valued random variable and that \(\mathbf {x}\) is a real-valued m-dimensional random vector, respectively.

3.2 Preliminaries

A stochastic process \(\left\{ \mathbf {x}_{k}\right\} , \mathbf {x}_k \in \mathbb {R}\) is said to be stationary if \( R_{\mathbf {x}}\left( i,k\right) := \mathbb {E}\left[ \mathbf {x}_i \mathbf {x}_{i+k} \right] \) depends only on k, and can thus be denoted as \(R_{\mathbf {x}}\left( k\right) \) for simplicity. The power spectrum of a stationary process \(\left\{ \mathbf {x}_{k} \right\} , \mathbf {x}_{k} \in \mathbb {R}\) is defined as

$$\begin{aligned} S_{\mathbf {x}}\left( \omega \right) := \sum _{k=-\infty }^{\infty } R_{\mathbf {x}}\left( k\right) \mathrm {e}^{-\mathrm {j}\omega k}. \nonumber \end{aligned}$$

Moreover, the variance of \(\left\{ \mathbf {x}_{k}\right\} \) is given by

$$\begin{aligned} \sigma _{\mathbf {x}}^2 = \mathbb {E}\left[ \mathbf {x}_k^2 \right] = \frac{1}{2\pi }\int _{-\pi }^{\pi } S_{\mathbf {x}}\left( \omega \right) \mathrm {d} \omega . \nonumber \end{aligned}$$
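As a quick numerical illustration of this identity, the following sketch (in Python, with an AR(1) process whose parameter values are purely illustrative assumptions) compares the spectral integral with the known closed-form variance:

```python
# A minimal sketch, assuming an AR(1) process x_{k+1} = a x_k + w_k with
# white noise variance sigma_w^2, whose power spectrum is
# S_x(omega) = sigma_w^2 / |e^{j omega} - a|^2.
import numpy as np

a, sigma_w2 = 0.8, 1.0
omega = np.linspace(-np.pi, np.pi, 100001)
S_x = sigma_w2 / np.abs(np.exp(1j * omega) - a) ** 2

# Variance via (1/2pi) * integral of S_x over [-pi, pi] ...
var_spectral = np.trapz(S_x, omega) / (2 * np.pi)
# ... versus the closed-form AR(1) variance sigma_w^2 / (1 - a^2).
var_closed = sigma_w2 / (1 - a**2)
print(var_spectral, var_closed)  # both approximately 2.7778
```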

The KL divergence (see, e.g., Kullback and Leibler (1951)) is defined as follows.

Definition 3.1

Consider random vectors \(\mathbf {x} \in \mathbb {R}^m\) and \(\mathbf {y} \in \mathbb {R}^m\) with probability densities \(p_\mathbf {x} \left( \mathbf {u} \right) \) and \(p_\mathbf {y} \left( \mathbf {u} \right) \), respectively. The KL divergence from distribution \(p_\mathbf {x}\) to distribution \(p_\mathbf {y}\) is defined as

$$\begin{aligned} \mathrm {KL} \left( p_{\mathbf {y}} \Vert p_{\mathbf {x}} \right) := \int p_{\mathbf {y}} \left( \mathbf {u} \right) \ln \frac{p_{\mathbf {y}} \left( \mathbf {u} \right) }{ p_{ \mathbf {x} } \left( \mathbf {u} \right) } \mathrm {d} \mathbf {u}. \nonumber \end{aligned}$$

The next lemma (see, e.g., Kay (2020)) provides an explicit expression of KL divergence in terms of covariance matrices for Gaussian random vectors; note that herein and in the sequel, all random variables and random vectors are assumed to be zero mean.

Lemma 3.1

Consider Gaussian random vectors \(\mathbf {x} \in \mathbb {R}^m\) and \(\mathbf {y} \in \mathbb {R}^m\) with covariance matrices \(\varSigma _\mathbf {x}\) and \(\varSigma _\mathbf {y}\), respectively. The KL divergence from distribution \(p_\mathbf {x}\) to distribution \(p_\mathbf {y}\) is given by

$$\begin{aligned} \mathrm {KL} \left( p_{\mathbf {y}} \Vert p_{\mathbf {x}} \right) = \frac{1}{2} \left[ \text {tr}\left( \varSigma _{\mathbf {y}} \varSigma _{\mathbf {x}}^{-1} \right) - \ln \det \left( \varSigma _{\mathbf {y}} \varSigma _{\mathbf {x}}^{-1} \right) - m \right] . \nonumber \end{aligned}$$

It is clear that in the scalar case (when \(m=1\)), Lemma 3.1 reduces to the following formula for Gaussian random variables:

$$\begin{aligned} \mathrm {KL} \left( p_{\mathbf {y} } \Vert p_{\mathbf {x} } \right) = \frac{1}{2} \left[ \frac{ \sigma _{\mathbf {y} }^2}{\sigma _{\mathbf {x} }^2} - \ln \left( \frac{ \sigma _{\mathbf {y} }^2 }{\sigma _{\mathbf {x} }^2} \right) - 1 \right] . \nonumber \end{aligned}$$
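The following sketch implements Lemma 3.1 directly from the covariance matrices and confirms that it reduces to the scalar formula above when \(m=1\) (the numerical values are illustrative):

```python
# A minimal sketch of Lemma 3.1 and its scalar (m = 1) specialization.
import numpy as np

def kl_gauss(cov_y, cov_x):
    """KL(p_y || p_x) for zero-mean Gaussians with covariances cov_y, cov_x."""
    m = cov_x.shape[0]
    A = cov_y @ np.linalg.inv(cov_x)
    return 0.5 * (np.trace(A) - np.log(np.linalg.det(A)) - m)

def kl_gauss_scalar(var_y, var_x):
    r = var_y / var_x
    return 0.5 * (r - np.log(r) - 1.0)

print(kl_gauss(np.array([[2.0]]), np.array([[1.0]])))  # 0.1534...
print(kl_gauss_scalar(2.0, 1.0))                       # 0.1534...
```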

The KL divergence rate (see, e.g., Lindquist and Picci (2015)) is defined as follows.

Definition 3.2

Consider stochastic processes \(\left\{ \mathbf {x}_k \right\} , \mathbf {x}_k \in \mathbb {R}^m\) and \(\left\{ \mathbf {y}_k \right\} , \mathbf {y}_k \in \mathbb {R}^m\) with densities \(p_{\left\{ \mathbf {x}_k \right\} }\) and \(p_{\left\{ \mathbf {y}_k \right\} }\), respectively; note that \(p_{\left\{ \mathbf {x}_k \right\} }\) and \(p_{\left\{ \mathbf {y}_k \right\} }\) will be denoted by \(p_\mathbf {x}\) and \(p_\mathbf {y}\) for simplicity in the sequel. Then, the KL divergence rate from distribution \(p_\mathbf {x}\) to distribution \(p_\mathbf {y}\) is defined as

$$\begin{aligned} \mathrm {KL}_{\infty } \left( p_{\mathbf {y}} \Vert p_{\mathbf {x}} \right) := \limsup _{k \rightarrow \infty } \frac{\mathrm {KL} \left( p_{\mathbf {y}_{0, \ldots , k}} \Vert p_{\mathbf {x}_{0, \ldots , k}} \right) }{k+1}. \nonumber \end{aligned}$$

The next lemma (see, e.g., Lindquist and Picci (2015)) provides an explicit expression of KL divergence rate in terms of power spectra for stationary Gaussian processes.

Lemma 3.2

Consider stationary Gaussian processes \(\left\{ \mathbf {x}_k \right\} , \mathbf {x}_k \in \mathbb {R}\) and \(\left\{ \mathbf {y}_k \right\} , \mathbf {y}_k \in \mathbb {R}\) with densities \(p_\mathbf {x}\) and \(p_\mathbf {y}\) as well as power spectra \(S_{\mathbf {x}} \left( \omega \right) \) and \(S_{\mathbf {y}} \left( \omega \right) \), respectively. Suppose that \(S_{\mathbf {y}} \left( \omega \right) / S_{\mathbf {x}} \left( \omega \right) \) is bounded (see Lindquist and Picci (2015) for details). Then, the KL divergence rate from distribution \(p_\mathbf {x}\) to distribution \(p_\mathbf {y}\) is given by

$$\begin{aligned} \mathrm {KL}_{\infty } \left( p_{\mathbf {y}} \Vert p_{\mathbf {x}} \right) = \frac{1}{2 \pi } \int _{0}^{2 \pi } \frac{1}{2} \left\{ \frac{S_{\mathbf {y}} \left( \omega \right) }{S_{\mathbf {x}} \left( \omega \right) } - \ln \left[ \frac{S_{\mathbf {y}} \left( \omega \right) }{S_{\mathbf {x}} \left( \omega \right) } \right] - 1 \right\} \mathrm {d} \omega . \end{aligned}$$
(3.1)
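Given two power spectra on a frequency grid, (3.1) can be evaluated by numerical integration; the sketch below does so for two illustrative AR(1) spectra (the pole locations are assumptions made only for this example):

```python
# A minimal sketch evaluating the KL divergence rate formula (3.1).
import numpy as np

def kl_rate(S_y, S_x, omega):
    """(1/2pi) * integral of (1/2)[S_y/S_x - ln(S_y/S_x) - 1] d omega."""
    ratio = S_y / S_x
    return np.trapz(0.5 * (ratio - np.log(ratio) - 1.0), omega) / (2 * np.pi)

omega = np.linspace(0.0, 2 * np.pi, 20001)
S_x = 1.0 / np.abs(np.exp(1j * omega) - 0.5) ** 2  # AR(1) spectrum, pole 0.5
S_y = 1.0 / np.abs(np.exp(1j * omega) - 0.7) ** 2  # AR(1) spectrum, pole 0.7
print(kl_rate(S_y, S_x, omega))  # a nonnegative divergence rate
```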

3.3 Stealthiness–Distortion Trade-Offs and Worst-Case Attacks

In this section, we analyze the fundamental stealthiness–distortion trade-offs of linear Gaussian open-loop dynamical systems and (closed-loop) feedback control systems under data injection attacks, where the KL divergence is employed as the stealthiness measure. Consider the scenario where the attacker can modify the system input, and consequently, the system state and system output will then all be changed. From the attacker’s point of view, the desired outcome is that the change in system state (as measured by state distortion) is large, while the change in system output (as measured by output stealthiness) is relatively small, so as to keep the possibility of being detected low. Meanwhile, fundamental trade-offs in general exist between state distortion and output stealthiness, since the system’s state and output are correlated. In other words, an increase in state distortion may inevitably lead to a decrease in output stealthiness, i.e., an increase in the possibility of being detected. How can such trade-offs be captured? And what is the worst-case attack that can cause the maximum distortion given a certain stealthiness level, or vice versa? The answers are provided subsequently in terms of power spectral analysis.

3.3.1 Open-Loop Dynamical Systems

In this subsection, we focus on open-loop dynamical systems. Specifically, consider the scalar dynamical system depicted in Fig. 3.1 with state-space model given by

$$\begin{aligned} \left\{ \begin{array}{rcl} \mathbf {x}_{k+1} &=& a \mathbf {x}_{k} + b \mathbf {u}_{k} + \mathbf {w}_k,\\ \mathbf {y}_{k} &=& c \mathbf {x}_{k} +\mathbf {v}_k, \end{array} \right. \nonumber \end{aligned}$$

where \(\mathbf {x}_{k} \in \mathbb {R}\) is the system state, \(\mathbf {u}_{k} \in \mathbb {R}\) is the system input, \(\mathbf {y}_{k} \in \mathbb {R}\) is the system output, \(\mathbf {w}_{k} \in \mathbb {R}\) is the process noise, and \(\mathbf {v}_{k} \in \mathbb {R}\) is the measurement noise. The system parameters are \( a \in \mathbb {R}\), \( b \in \mathbb {R}\), and \( c \in \mathbb {R}\); we further assume that \(\left| a \right| < 1\) and \(b, c \ne 0\), i.e., the system is stable, controllable, and observable. Accordingly, the transfer function of the system is given by

$$\begin{aligned} P \left( z \right) = \frac{bc}{z - a}. \end{aligned}$$
(3.2)

(It is clear that \(P \left( z \right) \) is minimum phase.) Suppose that \(\left\{ \mathbf {w}_{k} \right\} \) and \(\left\{ \mathbf {v}_{k} \right\} \) are stationary white Gaussian with variances \(\sigma _{\mathbf {w}}^2\) and \(\sigma _{\mathbf {v}}^2\), respectively. Furthermore, \(\left\{ \mathbf {w}_{k} \right\} \), \(\left\{ \mathbf {v}_{k} \right\} \), and \(\mathbf {x}_{0}\) are assumed to be mutually independent. Assume also that \(\left\{ \mathbf {u}_{k} \right\} \) is stationary with power spectrum \(S_{\mathbf {u}} \left( \omega \right) \). As such, \(\left\{ \mathbf {x}_{k} \right\} \) and \(\left\{ \mathbf {y}_{k} \right\} \) are both stationary; we denote their power spectra by \(S_{\mathbf {x}} \left( \omega \right) \) and \(S_{\mathbf {y}} \left( \omega \right) \), respectively.

Fig. 3.1 A dynamical system

Consider then the scenario that an attack signal \(\left\{ \mathbf {n}_{k} \right\} , \mathbf {n}_{k} \in \mathbb {R}\), is to be added to the input of the system \(\left\{ \mathbf {u}_{k} \right\} \) to deviate the system state, while aiming to be stealthy in the system output; see the depiction in Fig. 3.2. In addition, denote the true system input under attack as \(\left\{ \widehat{\mathbf {u}}_{k} \right\} \), where

$$\begin{aligned} \widehat{\mathbf {u}}_{k} = \mathbf {u}_{k} + \mathbf {n}_{k}, \end{aligned}$$
(3.3)

whereas the dynamics of the system under the attack \(\left\{ \mathbf {n}_{k} \right\} \) are given by

$$\begin{aligned} \left\{ \begin{array}{rcl} \widehat{\mathbf {x}}_{k+1} &{} = &{} a \widehat{\mathbf {x}}_{k} + b \widehat{\mathbf {u}}_{k} + \mathbf {w}_k = a \widehat{\mathbf {x}}_{k} + b \mathbf {u}_{k} + b \mathbf {n}_{k} + \mathbf {w}_k,\\ \widehat{\mathbf {y}}_{k} &{} = &{} c \widehat{\mathbf {x}}_{k} +\mathbf {v}_k. \end{array} \right. \end{aligned}$$
(3.4)

Meanwhile, suppose that the attack signal \(\left\{ \mathbf {n}_{k} \right\} \) is independent of \(\left\{ \mathbf {u}_{k} \right\} \), \(\left\{ \mathbf {w}_{k} \right\} \), \(\left\{ \mathbf {v}_{k} \right\} \), and \(\mathbf {x}_{0}\); consequently, \(\left\{ \mathbf {n}_{k} \right\} \) is independent of \(\left\{ \mathbf {x}_{k} \right\} \) and \(\left\{ \mathbf {y}_{k} \right\} \) as well.

The following questions then naturally arise: What is the fundamental trade-off between the degree of distortion caused in the system state (as measured by the mean squared-error distortion \(\mathbb {E} \left[ \left( \widehat{\mathbf {x}}_k - \mathbf {x}_{k} \right) ^{2} \right] \) between the original state \(\left\{ \mathbf {x}_{k} \right\} \) and the state under attack, denoted as \(\left\{ \widehat{\mathbf {x}}_k \right\} \)) and the level of stealthiness achieved in the system output (as measured by the KL divergence rate \(\mathrm {KL}_{\infty } \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) \) between the original output \(\left\{ \mathbf {y}_{k} \right\} \) and the output under attack, denoted as \(\left\{ \widehat{\mathbf {y}}_k \right\} \))? More specifically, to achieve a certain degree of distortion in state, what is the maximum level of stealthiness that can be maintained by the attacker? And what is the worst-case attack in this sense? The following theorem, as the first main result of this chapter, answers these questions.

Fig. 3.2 A dynamical system under injection attack

Theorem 3.1

Consider the dynamical system under injection attacks depicted in Fig. 3.2. Suppose that the attacker aims to design the attack signal \(\left\{ \mathbf {n}_{k} \right\} \) to satisfy the following attack goal in terms of state distortion:

$$\begin{aligned} \mathbb {E} \left[ \left( \widehat{\mathbf {x}}_k - \mathbf {x}_{k} \right) ^{2} \right] \ge D. \end{aligned}$$
(3.5)

Then, the minimum KL divergence rate between the original output and the attacked output is given by

$$\begin{aligned} \inf _{\mathbb {E} \left[ \left( \widehat{\mathbf {x}}_k - \mathbf {x}_{k} \right) ^{2} \right] \ge D} \mathrm {KL}_{\infty } \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) = \frac{1}{2 \pi } \int _{0}^{2 \pi } \frac{1}{2} \left\{ \frac{S_{\widehat{\mathbf {n}}} \left( \omega \right) }{S_{\mathbf {y}} \left( \omega \right) } - \ln \left[ 1 + \frac{S_{\widehat{\mathbf {n}}} \left( \omega \right) }{S_{\mathbf {y}} \left( \omega \right) } \right] \right\} \mathrm {d} \omega , \end{aligned}$$
(3.6)

where

$$\begin{aligned} S_{\widehat{\mathbf {n}}} \left( \omega \right) = \frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{ 1 - \zeta S_{\mathbf {y}} \left( \omega \right) }, \end{aligned}$$
(3.7)

and \(S_{\mathbf {y}} \left( \omega \right) \) is given by

$$\begin{aligned} S_{\mathbf {y}} \left( \omega \right)&= \frac{b^2 c^2}{\left| \mathrm {e}^{\mathrm {j} \omega } - a \right| ^2} S_{\mathbf {u}} \left( \omega \right) + \frac{c^2}{\left| \mathrm {e}^{\mathrm {j} \omega } - a \right| ^2} \sigma _{\mathbf {w}}^2 + \sigma _{\mathbf {v}}^2. \end{aligned}$$
(3.8)

Herein, \(\zeta \) is the unique constant that satisfies

$$\begin{aligned} \frac{1}{2\pi } \int _{-\pi }^{\pi } \frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{ 1 - \zeta S_{\mathbf {y}} \left( \omega \right) } \mathrm {d} \omega =c^2 D, \end{aligned}$$
(3.9)

while

$$\begin{aligned} 0< \zeta < \min _{\omega } \frac{1}{S_{\mathbf {y}} \left( \omega \right) }. \end{aligned}$$
(3.10)

Moreover, the worst-case (in the sense of achieving this minimum KL divergence rate) attack \(\left\{ \mathbf {n}_{k} \right\} \) is a stationary colored Gaussian process with power spectrum

$$\begin{aligned} S_{\mathbf {n}} \left( \omega \right) = \frac{\left| \mathrm {e}^{\mathrm {j} \omega } - a \right| ^2}{b^2 c^2} \frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{ 1 - \zeta S_{\mathbf {y}} \left( \omega \right) }. \end{aligned}$$
(3.11)
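Computationally, Theorem 3.1 amounts to a one-dimensional root-finding problem: since the left-hand side of (3.9) is increasing in \(\zeta \) on the interval (3.10), \(\zeta \) can be found by bisection, after which (3.6), (3.7), and (3.11) are evaluated directly. The following sketch illustrates this under assumed system parameters (including, for simplicity, a white input \(\left\{ \mathbf {u}_{k} \right\} \)):

```python
# A minimal sketch of the worst-case attack computation in Theorem 3.1;
# all parameter values here are illustrative assumptions.
import numpy as np

a, b, c = 0.5, 1.0, 1.0
sigma_w2, sigma_v2, S_u, D = 1.0, 1.0, 1.0, 0.5  # white input for simplicity

omega = np.linspace(-np.pi, np.pi, 40001)
gain2 = 1.0 / np.abs(np.exp(1j * omega) - a) ** 2
S_y = b**2 * c**2 * gain2 * S_u + c**2 * gain2 * sigma_w2 + sigma_v2  # (3.8)

def distortion(zeta):
    """Left-hand side of (3.9)."""
    return np.trapz(zeta * S_y**2 / (1.0 - zeta * S_y), omega) / (2 * np.pi)

lo, hi = 0.0, 1.0 / S_y.max()            # interval (3.10)
for _ in range(200):                     # bisection on the monotone map
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if distortion(mid) < c**2 * D else (lo, mid)
zeta = 0.5 * (lo + hi)

S_nhat = zeta * S_y**2 / (1.0 - zeta * S_y)            # (3.7)
S_n = S_nhat / (b**2 * c**2 * gain2)                   # (3.11)
kl_min = np.trapz(0.5 * (S_nhat / S_y - np.log1p(S_nhat / S_y)),
                  omega) / (2 * np.pi)                 # (3.6)
print(zeta, kl_min)
```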
Fig. 3.3 A dynamical system under injection attack: equivalent system

Proof

To begin with, it can be verified that the power spectrum of \(\left\{ \mathbf {y}_{k} \right\} \) is given by

$$\begin{aligned} S_{\mathbf {y}} \left( \omega \right)&= \left| P \left( \mathrm {e}^{\mathrm {j} \omega } \right) \right| ^2 S_{\mathbf {u}} \left( \omega \right) + \frac{1}{b^2} \left| P \left( \mathrm {e}^{\mathrm {j} \omega } \right) \right| ^2 \sigma _{\mathbf {w}}^2 + \sigma _{\mathbf {v}}^2 \nonumber , \\&= \frac{b^2 c^2}{\left| \mathrm {e}^{\mathrm {j} \omega } - a \right| ^2} S_{\mathbf {u}} \left( \omega \right) + \frac{c^2}{\left| \mathrm {e}^{\mathrm {j} \omega } - a \right| ^2} \sigma _{\mathbf {w}}^2 + \sigma _{\mathbf {v}}^2. \nonumber \end{aligned}$$

Note then that due to the property of additivity of linear systems, the system in Fig. 3.2 is equivalent to that of Fig. 3.3, where

$$\begin{aligned} \widehat{\mathbf {y}}_{k} = \mathbf {y}_{k} + \widehat{\mathbf {n}}_{k} , \nonumber \end{aligned}$$

and \(\left\{ \widehat{\mathbf {n}}_{k} \right\} \) is the output of the subsystem

$$\begin{aligned} \left\{ \begin{array}{rcl} \widehat{\mathbf {x}}_{k+1} - \mathbf {x}_{k+1} &=& a \left( \widehat{\mathbf {x}}_{k} - \mathbf {x}_{k} \right) + b \mathbf {n}_{k},\\ \widehat{\mathbf {n}}_{k} &=& c \left( \widehat{\mathbf {x}}_{k} - \mathbf {x}_{k} \right) , \end{array} \right. \nonumber \end{aligned}$$

as depicted by the upper half of Fig. 3.3; note that in this subsystem, \(\left( \widehat{\mathbf {x}}_{k} - \mathbf {x}_{k} \right) \in \mathbb {R}\) is the system state, \(\mathbf {n}_{k} \in \mathbb {R}\) is the system input, and \(\widehat{\mathbf {n}}_{k} \in \mathbb {R}\) is the system output. On the other hand, the distortion constraint

$$\begin{aligned} \mathbb {E} \left[ \left( \widehat{\mathbf {x}}_k - \mathbf {x}_{k} \right) ^{2} \right] \ge D \nonumber \end{aligned}$$

is then equivalent to the power constraint

$$\begin{aligned} \mathbb {E} \left[ \widehat{\mathbf {n}}_k^2 \right] \ge c^2 D, \nonumber \end{aligned}$$

since \(\widehat{\mathbf {n}}_k = \widehat{\mathbf {y}}_{k} - \mathbf {y}_{k}\) and thus

$$\begin{aligned} \widehat{\mathbf {n}}_k^2 = \left( \mathbf {y}_k - \widehat{\mathbf {y}}_{k} \right) ^2 = \left( c \mathbf {x}_k - c \widehat{\mathbf {x}}_{k} \right) ^2 = c^2 \left( \mathbf {x}_k - \widehat{\mathbf {x}}_{k} \right) ^2. \nonumber \end{aligned}$$

Accordingly, the system in Fig. 3.3 may be viewed as a “virtual channel” modeled as

$$\begin{aligned} \widehat{\mathbf {y}}_{k} = \mathbf {y}_{k} + \widehat{\mathbf {n}}_{k} \nonumber \end{aligned}$$

with noise constraint

$$\begin{aligned} \mathbb {E} \left[ \widehat{\mathbf {n}}_k^2 \right] \ge c^2 D, \nonumber \end{aligned}$$

where \(\left\{ \mathbf {y}_k \right\} \) is the channel input, \(\left\{ \widehat{\mathbf {y}}_k \right\} \) is the channel output, and \(\left\{ \widehat{\mathbf {n}}_k \right\} \) is the channel noise. In addition, due to the fact that \(\left\{ \mathbf {n}_{k} \right\} \) is independent of \(\left\{ \mathbf {y}_{k} \right\} \), \(\left\{ \widehat{\mathbf {n}}_{k} \right\} \) is also independent of \(\left\{ \mathbf {y}_{k} \right\} \).

The approach we shall take herein, as developed in Cover and Thomas (2006), is to treat the multiple uses of a scalar channel (i.e., a scalar dynamic channel) equivalently as a single use of parallel channels (i.e., a set of parallel static channels). We consider first the case of a finite number of parallel static channels with

$$\begin{aligned} \widehat{\mathbf {y}} = \mathbf {y} + \widehat{\mathbf {n}}, \nonumber \end{aligned}$$

where \( \mathbf {y}, \widehat{\mathbf {y}},\widehat{\mathbf {n}} \in \mathbb {R}^m\), and \( \widehat{\mathbf {n}} \) is independent of \( \mathbf {y} \). In addition, \(\mathbf {y}\) is Gaussian with covariance \(\varSigma _{\mathbf {y}}\), and the noise power constraint is given by

$$\begin{aligned} \text {tr}\left( \varSigma _{\widehat{\mathbf {n}} } \right) = \mathbb {E} \left[ \sum _{i=1}^{m} \widehat{\mathbf {n}}^{2}\left( i \right) \right] \ge m c^2 D, \nonumber \end{aligned}$$

where \(\widehat{\mathbf {n}} \left( i \right) \) denotes the i-th element of \(\widehat{\mathbf {n}}\). In addition, according to Fang and Zhu (2020) (see Proposition 2 therein), we have

$$\begin{aligned} \mathrm {KL} \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) \ge \mathrm {KL} \left( p_{\widehat{\mathbf {y}}^{\mathrm {G}}} \Vert p_{\mathbf {y}} \right) , \nonumber \end{aligned}$$

where \(\widehat{\mathbf {y}}^{\mathrm {G}}\) denotes a Gaussian random vector with the same covariance as \(\widehat{\mathbf {y}}\), and equality holds if \(\widehat{\mathbf {y}}\) is Gaussian. Meanwhile, it is known from Lemma 3.1 that

$$\begin{aligned} \mathrm {KL} \left( p_{\widehat{\mathbf {y}}^{\mathrm {G}}} \Vert p_{\mathbf {y}} \right) = \frac{1}{2} \left[ \text {tr}\left( \varSigma _{\widehat{\mathbf {y}}} \varSigma _{\mathbf {y}}^{-1} \right) - \ln \det \left( \varSigma _{\widehat{\mathbf {y}}} \varSigma _{\mathbf {y}}^{-1} \right) - m \right] . \nonumber \end{aligned}$$

On the other hand, since \(\mathbf {y}\) and \(\widehat{\mathbf {n}}\) are independent, we have

$$\begin{aligned} \varSigma _{\widehat{\mathbf {y}}} =\varSigma _{\widehat{\mathbf {n}} +\mathbf {y} } =\varSigma _{\widehat{\mathbf {n}}} +\varSigma _{\mathbf {y}}. \nonumber \end{aligned}$$

Consequently,

$$\begin{aligned} \text {tr}\left( \varSigma _{\widehat{\mathbf {y}}} \varSigma _{\mathbf {y}}^{-1} \right) - \ln \det \left( \varSigma _{\widehat{\mathbf {y}}} \varSigma _{\mathbf {y}}^{-1} \right) = \text {tr}\left[ \left( \varSigma _{\widehat{\mathbf {n}}} +\varSigma _{\mathbf {y}} \right) \varSigma _{\mathbf {y}}^{-1} \right] - \ln \det \left[ \left( \varSigma _{\widehat{\mathbf {n}}} +\varSigma _{\mathbf {y}} \right) \varSigma _{\mathbf {y}}^{-1} \right] . \nonumber \end{aligned}$$

Denote the eigendecomposition of \(\varSigma _{\mathbf {y}}\) by \(U_{\mathbf {y}} \varLambda _{\mathbf {y}} U^{\mathrm {T}}_{\mathbf {y}} \), where

$$\begin{aligned} \varLambda _{\mathbf {y}} = \mathrm {diag} \left( \lambda _{1}, \ldots , \lambda _{m} \right) . \nonumber \end{aligned}$$

Then,

$$\begin{aligned}&\text {tr}\left[ \left( \varSigma _{\widehat{\mathbf {n}}} +\varSigma _{\mathbf {y}} \right) \varSigma _{\mathbf {y}}^{-1} \right] - \ln \det \left[ \left( \varSigma _{\widehat{\mathbf {n}}} +\varSigma _{\mathbf {y}} \right) \varSigma _{\mathbf {y}}^{-1} \right] \nonumber \\&~~~~ = \text {tr}\left[ \left( \varSigma _{\widehat{\mathbf {n}}} + U_{\mathbf {y}} \varLambda _{\mathbf {y}} U^{\mathrm {T}}_{\mathbf {y}} \right) \left( U_{\mathbf {y}} \varLambda _{\mathbf {y}} U^{\mathrm {T}}_{\mathbf {y}} \right) ^{-1} \right] - \ln \det \left[ \left( \varSigma _{\widehat{\mathbf {n}}} + U_{\mathbf {y}} \varLambda _{\mathbf {y}} U^{\mathrm {T}}_{\mathbf {y}} \right) \left( U_{\mathbf {y}} \varLambda _{\mathbf {y}} U^{\mathrm {T}}_{\mathbf {y}} \right) ^{-1} \right] , \nonumber \\&~~~~ = \text {tr}\left[ \left( \varSigma _{\widehat{\mathbf {n}}} + U_{\mathbf {y}} \varLambda _{\mathbf {y}} U_{\mathbf {y}}^{\mathrm {T}} \right) U_{\mathbf {y}} \varLambda _{\mathbf {y}}^{-1} U_{\mathbf {y}}^{\mathrm {T}} \right] - \ln \det \left[ \left( \varSigma _{\widehat{\mathbf {n}}} + U_{\mathbf {y}} \varLambda _{\mathbf {y}} U_{\mathbf {y}}^{\mathrm {T}} \right) U_{\mathbf {y}} \varLambda _{\mathbf {y}}^{-1} U_{\mathbf {y}}^{\mathrm {T}} \right] , \nonumber \\&~~~~ = \text {tr}\left[ U_{\mathbf {y}} U_{\mathbf {y}}^{\mathrm {T}} \left( \varSigma _{\widehat{\mathbf {n}}} + U_{\mathbf {y}} \varLambda _{\mathbf {y}} U_{\mathbf {y}}^{\mathrm {T}} \right) U_{\mathbf {y}} \varLambda _{\mathbf {y}}^{-1} U_{\mathbf {y}}^{\mathrm {T}} \right] \nonumber \\&~~~~~~~~ - \ln \det \left[ U_{\mathbf {y}} U_{\mathbf {y}}^{\mathrm {T}} \left( \varSigma _{\widehat{\mathbf {n}}} + U_{\mathbf {y}} \varLambda _{\mathbf {y}} U_{\mathbf {y}}^{\mathrm {T}} \right) U_{\mathbf {y}} \varLambda _{\mathbf {y}}^{-1} U_{\mathbf {y}}^{\mathrm {T}} \right] , \nonumber \\&~~~~ = \text {tr}\left\{ U_{\mathbf {y}} \left[ U_{\mathbf {y}}^{\mathrm {T}} \left( \varSigma _{\widehat{\mathbf {n}}} + U_{\mathbf {y}} \varLambda _{\mathbf {y}} U_{\mathbf {y}}^{\mathrm {T}} \right) U_{\mathbf {y}} \varLambda _{\mathbf {y}}^{-1} \right] U_{\mathbf {y}}^{\mathrm {T}} \right\} \nonumber \\&~~~~~~~~ - \ln \det \left\{ U_{\mathbf {y}} \left[ U_{\mathbf {y}}^{\mathrm {T}} \left( \varSigma _{\widehat{\mathbf {n}}} + U_{\mathbf {y}} \varLambda _{\mathbf {y}} U_{\mathbf {y}}^{\mathrm {T}} \right) U_{\mathbf {y}} \varLambda _{\mathbf {y}}^{-1} \right] U_{\mathbf {y}}^{\mathrm {T}} \right\} ,\nonumber \\&~~~~ = \text {tr}\left[ U_{\mathbf {y}}^{\mathrm {T}} \left( \varSigma _{\widehat{\mathbf {n}}} + U_{\mathbf {y}} \varLambda _{\mathbf {y}} U_{\mathbf {y}}^{\mathrm {T}} \right) U_{\mathbf {y}} \varLambda _{\mathbf {y}}^{-1} \right] - \ln \det \left[ U_{\mathbf {y}}^{\mathrm {T}} \left( \varSigma _{\widehat{\mathbf {n}}} + U_{\mathbf {y}} \varLambda _{\mathbf {y}} U_{\mathbf {y}}^{\mathrm {T}} \right) U_{\mathbf {y}} \varLambda _{\mathbf {y}}^{-1} \right] , \nonumber \\&~~~~ = \text {tr}\left[ \left( U_{\mathbf {y}}^{\mathrm {T}} \varSigma _{\widehat{\mathbf {n}}} U_{\mathbf {y}} + \varLambda _{\mathbf {y}} \right) \varLambda _{\mathbf {y}}^{-1} \right] - \ln \det \left[ \left( U_{\mathbf {y}}^{\mathrm {T}} \varSigma _{\widehat{\mathbf {n}}} U_{\mathbf {y}} + \varLambda _{\mathbf {y}} \right) \varLambda _{\mathbf {y}}^{-1} \right] , \nonumber \\&~~~~ = \text {tr}\left[ \left( \overline{\varSigma }_{\widehat{\mathbf {n}} } + \varLambda _{\mathbf {y}} \right) \varLambda _{\mathbf {y}}^{-1} \right] - \ln \det \left[ \left( \overline{\varSigma }_{\widehat{\mathbf {n}} } + \varLambda _{\mathbf {y}} \right) \varLambda _{\mathbf {y}}^{-1} \right] , \nonumber \end{aligned}$$

where \(\overline{\varSigma }_{\widehat{\mathbf {n}} } =U^{\mathrm {T}}_{\mathbf {y}}\varSigma _{\widehat{\mathbf {n}}} U_{\mathbf {y}} \). Denoting the diagonal terms of \(\overline{\varSigma }_{\widehat{\mathbf {n}} }\) by \(\overline{\sigma }_{\widehat{\mathbf {n}} \left( i \right) }^2, i=1,\ldots ,m\), it is known from (Fang and Zhu 2020) (see Proposition 4 therein) that

$$\begin{aligned} \text {tr}\left[ \left( \overline{\varSigma }_{\widehat{\mathbf {n}} } + \varLambda _{\mathbf {y}} \right) \varLambda _{\mathbf {y}}^{-1} \right]&- \ln \det \left[ \left( \overline{\varSigma }_{\widehat{\mathbf {n}} } + \varLambda _{\mathbf {y}} \right) \varLambda _{\mathbf {y}}^{-1} \right] , \nonumber \\&\ge \sum _{i=1}^{m} \left[ \frac{ \overline{\sigma }_{\widehat{\mathbf {n}} \left( i \right) }^2 + \lambda _{i} }{\lambda _{i}} \right] - \sum _{i=1}^m \ln \left[ \frac{ \overline{\sigma }_{\widehat{\mathbf {n}} \left( i \right) }^2 + \lambda _{i}}{\lambda _{i}} \right] , \nonumber \\&= \sum _{i=1}^{m} \left[ 1 + \frac{ \overline{\sigma }_{\widehat{\mathbf {n}} \left( i \right) }^2}{\lambda _{i}} \right] - \sum _{i=1}^m \ln \left[ 1 + \frac{ \overline{\sigma }_{\widehat{\mathbf {n}} \left( i \right) }^2 }{\lambda _{i}} \right] , \nonumber \end{aligned}$$

where equality holds if \(\overline{\varSigma }_{\widehat{\mathbf {n}}}\) is diagonal. For simplicity, we denote

$$\begin{aligned} \overline{\varSigma }_{\widehat{\mathbf {n}} }=\mathrm {diag}\left( \overline{\sigma }_{\widehat{\mathbf {n}} \left( 1\right) }^2,\ldots , \overline{\sigma }_{\widehat{\mathbf {n}} \left( m\right) }^2 \right) =\mathrm {diag}\left( \widehat{N}_{1},\ldots ,\widehat{N}_{m}\right) \nonumber \end{aligned}$$

when \(\overline{\varSigma }_{\widehat{\mathbf {n}} }\) is diagonal. Then, the problem reduces to that of choosing \(\widehat{N}_1,\ldots , \widehat{N}_m\) to minimize

$$\begin{aligned} \sum _{i=1}^{m} \left( 1 + \frac{ \widehat{N}_{i}}{\lambda _{i}} \right) - \sum _{i=1}^m \ln \left( 1 + \frac{ \widehat{N}_{i}}{\lambda _{i}} \right) \nonumber \end{aligned}$$

subject to the constraint (attained with equality at the minimum, since the objective is increasing in each \(\widehat{N}_{i}\)) that

$$\begin{aligned} \sum _{i=1}^{m} \widehat{N}_{i} = \text {tr}\left( \overline{\varSigma }_{\widehat{\mathbf {n}}} \right) = \text {tr}\left( U^{\mathrm {T}}_{\mathbf {y}}\varSigma _{\widehat{\mathbf {n}}} U_{\mathbf {y}} \right) = \text {tr}\left( \varSigma _{\widehat{\mathbf {n}}} U_{\mathbf {y}} U^{\mathrm {T}}_{\mathbf {y}} \right) = \text {tr}\left( \varSigma _{\widehat{\mathbf {n}}} \right) = m c^2 D. \nonumber \end{aligned}$$

Define the Lagrange function by

$$\begin{aligned} \sum _{i=1}^{m} \left( 1 + \frac{ \widehat{N}_{i}}{\lambda _{i}} \right) - \sum _{i=1}^m \ln \left( 1 + \frac{ \widehat{N}_{i}}{\lambda _{i}} \right) + \eta \left( \sum _{i=1}^{m} \widehat{N}_{i}- m c^2 D\right) , \nonumber \end{aligned}$$

and differentiating it with respect to \(\widehat{N}_{i}\) and setting the derivative to zero, we have

$$\begin{aligned} \frac{1}{\lambda _{i}} - \frac{1}{\widehat{N}_{i}+ \lambda _{i}} + \eta =0, \nonumber \end{aligned}$$

or equivalently,

$$\begin{aligned} \widehat{N}_{i} = \frac{1}{\frac{1}{\lambda _{i}} + \eta } - \lambda _{i} = \frac{\lambda _{i}}{ 1 + \eta \lambda _{i}} - \lambda _{i} = \frac{ - \eta \lambda _{i}^2}{ 1 + \eta \lambda _{i}}, \nonumber \end{aligned}$$

where \(\eta \) satisfies

$$\begin{aligned} \sum _{i=1}^{m} \widehat{N}_{i} = \sum _{i=1}^{m} \frac{- \eta \lambda _{i}^2}{ 1 + \eta \lambda _{i}} = m c^2 D, \nonumber \end{aligned}$$

while

$$\begin{aligned} - \min _{i = 1, \ldots , m} \frac{1}{ \lambda _{i}}< \eta < 0. \nonumber \end{aligned}$$

For simplicity, we denote \(\zeta = - \eta \), and accordingly,

$$\begin{aligned} \widehat{N}_{i} = \frac{ \zeta \lambda _{i}^2}{ 1 - \zeta \lambda _{i}}, \nonumber \end{aligned}$$

where \(\zeta \) satisfies

$$\begin{aligned} \sum _{i=1}^{m} \widehat{N}_{i} = \sum _{i=1}^{m} \frac{ \zeta \lambda _{i}^2}{ 1 - \zeta \lambda _{i}} = m c^2 D, \nonumber \end{aligned}$$

while

$$\begin{aligned} 0< \zeta < \min _{i = 1, \ldots , m} \frac{1}{ \lambda _{i}} . \nonumber \end{aligned}$$

Correspondingly,

$$\begin{aligned} \inf _{p_{\widehat{\mathbf {n}}}} \mathrm {KL} \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right)&= \frac{1}{2} \left[ \sum _{i=1}^{m} \left( 1 + \frac{ \widehat{N}_{i}}{\lambda _{i}} \right) - \sum _{i=1}^m \ln \left( 1 + \frac{ \widehat{N}_{i}}{\lambda _{i}} \right) - m \right] , \nonumber \\&= \sum _{i=1}^m \frac{1}{2} \left[ \frac{ \widehat{N}_{i}}{\lambda _{i}} - \ln \left( 1 + \frac{ \widehat{N}_{i}}{\lambda _{i}} \right) \right] . \nonumber \end{aligned}$$
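Before passing to the dynamic channel, the finite-dimensional allocation above can be summarized in a few lines of code; the eigenvalues and the budget below are illustrative assumptions:

```python
# A minimal sketch of the optimal parallel-channel noise allocation.
import numpy as np

lam = np.array([0.5, 1.0, 2.0, 4.0])  # eigenvalues lambda_i of Sigma_y
budget = lam.size * 0.3               # m * c^2 * D (illustrative)

def total_noise(zeta):
    return np.sum(zeta * lam**2 / (1.0 - zeta * lam))

lo, hi = 0.0, 1.0 / lam.max()         # admissible interval for zeta
for _ in range(200):                  # bisection on the monotone map
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if total_noise(mid) < budget else (lo, mid)
zeta = 0.5 * (lo + hi)

N = zeta * lam**2 / (1.0 - zeta * lam)              # optimal N_i
kl_min = 0.5 * np.sum(N / lam - np.log1p(N / lam))  # minimum KL
print(N, kl_min)
```

Note that, since \(\widehat{N}_{i} = \zeta \lambda _{i}^{2} / \left( 1 - \zeta \lambda _{i} \right) \) is increasing in \(\lambda _{i}\), the noise is allocated preferentially to the directions where the output is already strong.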

Consider now a scalar dynamic channel

$$\begin{aligned} \widehat{\mathbf {y}}_{k} = \mathbf {y}_{k} + \widehat{\mathbf {n}}_{k}, \nonumber \end{aligned}$$

where \( \mathbf {y}_{k}, \widehat{\mathbf {n}}_{k}, \widehat{\mathbf {y}}_{k} \in \mathbb {R}\), while \( \left\{ \mathbf {y}_{k} \right\} \) and \( \left\{ \widehat{\mathbf {n}}_{k} \right\} \) are independent. In addition, \(\left\{ \mathbf {y}_{k} \right\} \) is stationary colored Gaussian with power spectrum \(S_{\mathbf {y}} \left( \omega \right) \), whereas the noise power constraint is given by \(\mathbb {E} \left[ \widehat{\mathbf {n}}^{2}_{k} \right] \ge c^2 D\). We may then consider a block of consecutive uses from time 0 to k of this channel as \(k+1\) channels in parallel (Cover and Thomas 2006). Particularly, let the eigendecomposition of \(\varSigma _{\mathbf {y}_{0,\ldots ,k}}\) be given by

$$\begin{aligned} \varSigma _{\mathbf {y}_{0,\ldots ,k}}=U_{\mathbf {y}_{0,\ldots ,k}}\varLambda _{\mathbf {y}_{0,\ldots ,k}}U^{\mathrm {T}}_{\mathbf {y}_{0,\ldots ,k}}, \nonumber \end{aligned}$$

where

$$\begin{aligned} \varLambda _{\mathbf {y}_{0,\ldots ,k}} =\mathrm {diag} \left( \lambda _{0},\ldots ,\lambda _{k} \right) . \nonumber \end{aligned}$$

Then, we have

$$\begin{aligned} \min _{p_{\widehat{\mathbf {n}}_{0,\ldots ,k}}:~\sum _{i=0}^{k} \mathbb {E} \left[ \widehat{\mathbf {n}}_{i}^{2} \right] \ge \left( k+1\right) c^2 D} \frac{\mathrm {KL} \left( p_{\widehat{\mathbf {y}}_{0,\ldots ,k}} \Vert p_{\mathbf {y}_{0,\ldots ,k}} \right) }{k+1} = \frac{1}{k+1} \sum _{i=0}^{k} \frac{1}{2} \left[ \frac{ \widehat{N}_{i}}{\lambda _{i}} - \ln \left( 1 + \frac{ \widehat{N}_{i}}{\lambda _{i}} \right) \right] , \nonumber \end{aligned}$$

where

$$\begin{aligned} \widehat{N}_{i} = \frac{\zeta \lambda _{i}^2}{ 1 - \zeta \lambda _{i}},~i=0,\ldots ,k. \nonumber \end{aligned}$$

Herein, \(\zeta \) satisfies

$$\begin{aligned} \sum _{i=0}^{k} \widehat{N}_{i} = \sum _{i=0}^{k} \frac{\zeta \lambda _{i}^2}{ 1 - \zeta \lambda _{i}} = \left( k+1 \right) c^2 D, \nonumber \end{aligned}$$

or equivalently,

$$\begin{aligned} \frac{1}{k+1} \sum _{i=0}^{k} \widehat{N}_{i} = \frac{1}{k+1} \sum _{i=0}^{k} \frac{\zeta \lambda _{i}^2}{ 1 - \zeta \lambda _{i}} = c^2 D, \nonumber \end{aligned}$$

while

$$\begin{aligned} 0< \zeta < \min _{i = 0, \ldots , k} \frac{1}{ \lambda _{i}}. \nonumber \end{aligned}$$

In addition, since the processes \( \left\{ \mathbf {y}_{k} \right\} \), \( \left\{ \widehat{\mathbf {n}}_{k} \right\} \), and \( \left\{ \widehat{\mathbf {y}}_{k} \right\} \) are stationary, we have

$$\begin{aligned}&\lim _{k\rightarrow \infty } \min _{p_{\widehat{\mathbf {n}}_{0,\ldots ,k}}:~\sum _{i=0}^{k} \mathbb {E} \left[ \widehat{\mathbf {n}}_{i}^{2} \right] \ge \left( k+1\right) c^2 D} \frac{\mathrm {KL} \left( p_{\widehat{\mathbf {y}}_{0, \ldots , k}} \Vert p_{\mathbf {y}_{0, \ldots , k}} \right) }{k+1} \nonumber \\&~~~~ = \inf _{\mathbb {E} \left[ \widehat{\mathbf {n}}_k^{2} \right] \ge c^2D} \lim _{k \rightarrow \infty } \frac{\mathrm {KL} \left( p_{\widehat{\mathbf {y}}_{0, \ldots , k}} \Vert p_{\mathbf {y}_{0, \ldots , k}} \right) }{k+1} = \inf _{\mathbb {E} \left[ \widehat{\mathbf {n}}_k^{2} \right] \ge c^2D} \limsup _{k \rightarrow \infty } \frac{\mathrm {KL} \left( p_{\widehat{\mathbf {y}}_{0, \ldots , k}} \Vert p_{\mathbf {y}_{0, \ldots , k}} \right) }{k+1} \nonumber \\&~~~~ = \inf _{\mathbb {E} \left[ \widehat{\mathbf {n}}_k^{2} \right] \ge c^2D} \mathrm {KL}_{\infty } \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) = \inf _{\mathbb {E} \left[ \left( \widehat{\mathbf {x}}_k - \mathbf {x}_{k} \right) ^{2} \right] \ge D} \mathrm {KL}_{\infty } \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) . \nonumber \end{aligned}$$

On the other hand, since the processes are stationary, the covariance matrices are Toeplitz (Grenander and Szegö 1958), and their eigenvalues approach their limits as \(k \rightarrow \infty \). Moreover, the asymptotic distribution of the eigenvalues on the real line tends to the distribution of values of the power spectra of the processes (Gutiérrez-Gutiérrez and Crespo 2008; Lindquist and Picci 2015; Pinsker 1964). Accordingly,

$$\begin{aligned}&\inf _{\mathbb {E} \left[ \left( \widehat{\mathbf {x}}_k - \mathbf {x}_{k} \right) ^{2} \right] \ge D} \mathrm {KL}_{\infty } \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) = \lim _{k\rightarrow \infty } \frac{1}{k+1} \sum _{i=0}^{k} \frac{1}{2} \left[ \frac{ \widehat{N}_{i}}{\lambda _{i}} - \ln \left( 1 + \frac{ \widehat{N}_{i}}{\lambda _{i}} \right) \right] , \nonumber \\&~~~~ = \frac{1}{2 \pi } \int _{0}^{2 \pi } \frac{1}{2} \left\{ \frac{S_{\widehat{\mathbf {n}}} \left( \omega \right) }{S_{\mathbf {y}} \left( \omega \right) } - \ln \left[ 1 + \frac{S_{\widehat{\mathbf {n}}} \left( \omega \right) }{S_{\mathbf {y}} \left( \omega \right) } \right] \right\} \mathrm {d} \omega , \nonumber \end{aligned}$$

where

$$\begin{aligned} S_{\widehat{\mathbf {n}}} \left( \omega \right) = \frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{ 1 - \zeta S_{\mathbf {y}} \left( \omega \right) }, \nonumber \end{aligned}$$

and \(\zeta \) satisfies

$$\begin{aligned} \lim _{k\rightarrow \infty } \frac{1}{k+1}\sum _{i=0}^{k} \widehat{N}_{i} =\frac{1}{2\pi } \int _{-\pi }^{\pi } S_{\widehat{\mathbf {n}}} \left( \omega \right) \mathrm {d} \omega =\frac{1}{2\pi } \int _{-\pi }^{\pi } \frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{ 1 - \zeta S_{\mathbf {y}} \left( \omega \right) } \mathrm {d} \omega =c^2 D, \nonumber \end{aligned}$$

while

$$\begin{aligned} 0< \zeta < \min _{\omega } \frac{1}{S_{\mathbf {y}} \left( \omega \right) }. \nonumber \end{aligned}$$

Lastly, note that

$$\begin{aligned} S_{\widehat{\mathbf {n}}} \left( \omega \right) = \left| P \left( \mathrm {e}^{\mathrm {j} \omega } \right) \right| ^2 S_{\mathbf {n}} \left( \omega \right) = \frac{b^2 c^2}{\left| \mathrm {e}^{\mathrm {j} \omega } - a \right| ^2} S_{\mathbf {n}} \left( \omega \right) , \nonumber \end{aligned}$$

and hence

$$\begin{aligned} S_{\mathbf {n}} \left( \omega \right) = \frac{\left| \mathrm {e}^{\mathrm {j} \omega } - a \right| ^2}{b^2 c^2} S_{\widehat{\mathbf {n}}} \left( \omega \right) = \frac{\left| \mathrm {e}^{\mathrm {j} \omega } - a \right| ^2}{b^2 c^2} \frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{ 1 - \zeta S_{\mathbf {y}} \left( \omega \right) }. \nonumber \end{aligned}$$

This concludes the proof. \(\blacksquare \)
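As an aside, the Toeplitz eigenvalue argument invoked in the proof can be checked numerically; the following sketch (for an AR(1) process with assumed parameters, and not part of the proof) compares the eigenvalue range of a large Toeplitz covariance matrix with the range of the corresponding power spectrum:

```python
# A minimal sketch: Toeplitz covariance eigenvalues vs. the power spectrum.
import numpy as np
from scipy.linalg import toeplitz

a, sigma_w2, k = 0.7, 1.0, 400
lags = np.arange(k + 1)
R = sigma_w2 / (1.0 - a**2) * a**lags   # AR(1) autocovariance R(k)
eigs = np.linalg.eigvalsh(toeplitz(R))  # eigenvalues of the covariance

omega = np.linspace(0.0, np.pi, k + 1)
S = sigma_w2 / np.abs(np.exp(1j * omega) - a) ** 2
# The eigenvalue range approaches [min S, max S] as k grows.
print(eigs.min(), S.min(), eigs.max(), S.max())
```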

It is clear that \(S_{\mathbf {n}} \left( \omega \right) \) may be rewritten as

$$\begin{aligned} S_{\mathbf {n}} \left( \omega \right) = \frac{1}{\left| P \left( \mathrm {e}^{\mathrm {j} \omega } \right) \right| ^2} \frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{ 1 - \zeta S_{\mathbf {y}} \left( \omega \right) }. \end{aligned}$$
(3.12)

This means that the attacker only needs the knowledge of the power spectrum of the original system output \(\left\{ \mathbf {y}_{k} \right\} \) and the transfer function of the system (from \(\left\{ \mathbf {n}_{k} \right\} \) to \(\left\{ \widehat{\mathbf {y}}_{k} \right\} \)), i.e., \(P \left( z \right) \), in order to carry out this worst-case attack. It is worth mentioning that the power spectrum of \(\left\{ \mathbf {y}_{k} \right\} \) can be estimated based on its realizations (see, e.g., Stoica and Moses (2005)), while the transfer function of the system can be approximated by system identification (see, e.g., Ljung (1999)).
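For instance, the sketch below (with an assumed first-order system standing in for the output dynamics) estimates the output power spectrum from a single realization via Welch’s method and compares it with the analytical spectrum:

```python
# A minimal sketch of spectrum estimation from output realizations.
import numpy as np
from scipy.signal import lfilter, welch

rng = np.random.default_rng(0)
a, n = 0.8, 200_000
w = rng.standard_normal(n)
y = lfilter([1.0], [1.0, -a], w)  # y_k = a y_{k-1} + w_k (stand-in output)

# Two-sided Welch estimate with fs = 1, so omega = 2*pi*f, f in [-1/2, 1/2).
f, S_est = welch(y, nperseg=4096, return_onesided=False)
S_true = 1.0 / np.abs(np.exp(2j * np.pi * f) - a) ** 2
# Median relative error is roughly 1/sqrt(number of segments) here.
print(np.median(np.abs(S_est - S_true) / S_true))
```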

Note that it can be verified (Kay 2020) that the (minimum) output KL divergence rate \(\mathrm {KL}_{\infty } \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) \) increases strictly with the state distortion bound D. In other words, in order for the attacker to achieve larger distortion, the stealthiness level of the attack will inevitably decrease.

On the other hand, the dual problem to that of Theorem 3.1 would be: Given a certain stealthiness level in output, what is the maximum distortion in state that can be achieved by the attacker? And what is the corresponding attack? The following corollary answers these questions.

Corollary 3.1

Consider the dynamical system under injection attacks depicted in Fig. 3.2. Then, in order for the attacker to ensure that the KL divergence rate between the original output and the attacked output is upper bounded by a (positive) constant R as

$$\begin{aligned} \mathrm {KL}_{\infty } \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) \le R, \end{aligned}$$
(3.13)

the maximum state distortion \(\mathbb {E} \left[ \left( \widehat{\mathbf {x}}_k - \mathbf {x}_{k} \right) ^{2} \right] \) that can be achieved is given by

$$\begin{aligned} \sup _{\mathrm {KL}_{\infty } \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) \le R} \mathbb {E} \left[ \left( \widehat{\mathbf {x}}_k - \mathbf {x}_{k} \right) ^{2} \right] = \frac{1}{2\pi } \int _{-\pi }^{\pi } \frac{1}{c^2} \left[ \frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{ 1 - \zeta S_{\mathbf {y}} \left( \omega \right) } \right] \mathrm {d} \omega , \end{aligned}$$
(3.14)

where \(\zeta \) is the unique constant that satisfies

$$\begin{aligned}&\frac{1}{2 \pi } \int _{0}^{2 \pi } \frac{1}{2} \left\{ \frac{\frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{1 - \zeta S_{\mathbf {y}} \left( \omega \right) }}{S_{\mathbf {y}} \left( \omega \right) } - \ln \left[ 1 + \frac{\frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{1 - \zeta S_{\mathbf {y}} \left( \omega \right) }}{S_{\mathbf {y}} \left( \omega \right) } \right] \right\} \mathrm {d} \omega \nonumber \\&~~~~ = \frac{1}{2 \pi } \int _{0}^{2 \pi } \frac{1}{2} \left\{ \frac{\zeta S_{\mathbf {y}} \left( \omega \right) }{1 - \zeta S_{\mathbf {y}} \left( \omega \right) } - \ln \left[ \frac{1}{1 - \zeta S_{\mathbf {y}} \left( \omega \right) } \right] \right\} \mathrm {d} \omega = R, \end{aligned}$$
(3.15)

while

$$\begin{aligned} 0< \zeta < \min _{\omega } \frac{1}{S_{\mathbf {y}} \left( \omega \right) }. \end{aligned}$$
(3.16)

Note that herein \(S_{\mathbf {y}} \left( \omega \right) \) is given by (3.8). Moreover, this maximum distortion is achieved when the attack signal \(\left\{ \mathbf {n}_{k} \right\} \) is chosen as a stationary colored Gaussian process with power spectrum

$$\begin{aligned} S_{\mathbf {n}} \left( \omega \right) = \frac{\left| \mathrm {e}^{\mathrm {j} \omega } - a \right| ^2}{b^2 c^2} \frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{ 1 - \zeta S_{\mathbf {y}} \left( \omega \right) }. \end{aligned}$$
(3.17)
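The dual computation proceeds in the same manner: since the left-hand side of (3.15) is also increasing in \(\zeta \) on the interval (3.16), \(\zeta \) can again be found by bisection, after which (3.14) is evaluated directly. A sketch follows, with an illustrative output spectrum standing in for (3.8):

```python
# A minimal sketch of the dual computation in Corollary 3.1;
# S_y below is an illustrative stand-in for (3.8).
import numpy as np

omega = np.linspace(-np.pi, np.pi, 40001)
S_y = 1.0 / np.abs(np.exp(1j * omega) - 0.5) ** 2 + 1.0
R, c = 0.05, 1.0                 # stealthiness bound and output gain

def kl_rate(zeta):
    """Left-hand side of (3.15)."""
    r = zeta * S_y / (1.0 - zeta * S_y)
    return np.trapz(0.5 * (r - np.log1p(r)), omega) / (2 * np.pi)

lo, hi = 0.0, 1.0 / S_y.max()    # interval (3.16)
for _ in range(200):             # bisection on the monotone map
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if kl_rate(mid) < R else (lo, mid)
zeta = 0.5 * (lo + hi)

D_max = np.trapz(zeta * S_y**2 / (1.0 - zeta * S_y),
                 omega) / (2 * np.pi) / c**2   # (3.14)
print(zeta, D_max)
```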
Fig. 3.4 A feedback control system

3.3.2 Feedback Control Systems

We will now proceed to examine (closed-loop) feedback control systems in this subsection. Specifically, consider the feedback control system depicted in Fig. 3.4, where the state-space model of the plant is given by

$$\begin{aligned} \left\{ \begin{array}{rcl} \mathbf {x}_{k+1} &=& a \mathbf {x}_{k} + b \mathbf {u}_{k} + \mathbf {w}_k,\\ \mathbf {y}_{k} &=& c \mathbf {x}_{k} +\mathbf {v}_k, \end{array} \right. \nonumber \end{aligned}$$

while \(K \left( z \right) \) is the transfer function of the (dynamic) output controller. Herein, \(\mathbf {x}_{k} \in \mathbb {R}\) is the plant state, \(\mathbf {u}_{k} \in \mathbb {R}\) is the plant input, \(\mathbf {y}_{k} \in \mathbb {R}\) is the plant output, \(\mathbf {w}_{k} \in \mathbb {R}\) is the process noise, and \(\mathbf {v}_{k} \in \mathbb {R}\) is the measurement noise. The system parameters are \( a \in \mathbb {R}\), \( b \in \mathbb {R}\), and \( c \in \mathbb {R}\). Note that the plant is not necessarily stable. Meanwhile, we assume that \(b,c \ne 0\), i.e., the plant is controllable and observable, and thus can be stabilized by a controller \(K \left( z \right) \). On the other hand, the transfer function of the plant is given by

$$\begin{aligned} P \left( z \right) = \frac{bc}{z - a}. \end{aligned}$$
(3.18)

Suppose that \(\left\{ \mathbf {w}_{k} \right\} \) and \(\left\{ \mathbf {v}_{k} \right\} \) are stationary white Gaussian with variances \(\sigma _{\mathbf {w}}^2\) and \(\sigma _{\mathbf {v}}^2\), respectively. Furthermore, \(\left\{ \mathbf {w}_{k} \right\} \), \(\left\{ \mathbf {v}_{k} \right\} \), and \(\mathbf {x}_{0}\) are assumed to be mutually independent. Assume also that \(K \left( z \right) \) stabilizes \(P \left( z \right) \), i.e., the closed-loop system is stable. Accordingly, \(\left\{ \mathbf {x}_{k} \right\} \) and \(\left\{ \mathbf {y}_{k} \right\} \) are both stationary; we denote their power spectra by \(S_{\mathbf {x}} \left( \omega \right) \) and \(S_{\mathbf {y}} \left( \omega \right) \), respectively.

Fig. 3.5 A feedback control system under actuator attack

Consider then the scenario that an attack signal \(\left\{ \mathbf {n}_{k} \right\} , \mathbf {n}_{k} \in \mathbb {R}\), is to be added to the input of the plant \(\left\{ \mathbf {u}_{k} \right\} \) to deviate the plant state, while aiming to be stealthy in the plant output; see the depiction in Fig. 3.5. In fact, this corresponds to an actuator attack. Note in particular that since we are now considering a closed-loop system, the presence of \(\left\{ \mathbf {n}_{k} \right\} \) will eventually distort the original \(\left\{ \mathbf {u}_{k} \right\} \) (through feedback) as well, which is an essential difference from the open-loop system setting considered in Sect. 3.3.1; the distorted \(\left\{ \mathbf {u}_{k} \right\} \) will be denoted as \(\left\{ \overline{\mathbf {u}}_{k} \right\} \). In addition, we denote the true plant input under attack as \(\left\{ \widehat{\mathbf {u}}_{k} \right\} \), where

$$\begin{aligned} \widehat{\mathbf {u}}_{k} = \overline{\mathbf {u}}_{k} + \mathbf {n}_{k}, \end{aligned}$$
(3.19)

whereas the dynamics of the plant under the attack \(\left\{ \mathbf {n}_{k} \right\} \) are given by

$$\begin{aligned} \left\{ \begin{array}{rcl} \widehat{\mathbf {x}}_{k+1} &=& a \widehat{\mathbf {x}}_{k} + b \widehat{\mathbf {u}}_{k} + \mathbf {w}_k = a \widehat{\mathbf {x}}_{k} + b \overline{\mathbf {u}}_{k} + b \mathbf {n}_{k} + \mathbf {w}_k,\\ \widehat{\mathbf {y}}_{k} &=& c \widehat{\mathbf {x}}_{k} +\mathbf {v}_k. \end{array} \right. \end{aligned}$$
(3.20)

Meanwhile, suppose that the attack signal \(\left\{ \mathbf {n}_{k} \right\} \) is independent of \(\left\{ \mathbf {w}_{k} \right\} \), \(\left\{ \mathbf {v}_{k} \right\} \), and \(\mathbf {x}_{0}\); consequently, \(\left\{ \mathbf {n}_{k} \right\} \) is independent of \(\left\{ \mathbf {x}_{k} \right\} \) and \(\left\{ \mathbf {y}_{k} \right\} \) as well.

The following theorem, as the second main result of this chapter, characterizes the fundamental trade-off between the distortion in state and the stealthiness in output for feedback control systems.

Theorem 3.2

Consider the feedback control system under injection attacks depicted in Fig. 3.5. Suppose that the attacker needs to design the attack signal \(\left\{ \mathbf {n}_{k} \right\} \) to satisfy the following attack goal in terms of state distortion:

$$\begin{aligned} \mathbb {E} \left[ \left( \widehat{\mathbf {x}}_k - \mathbf {x}_{k} \right) ^{2} \right] \ge D. \end{aligned}$$
(3.21)

Then, the minimum KL divergence rate between the original output and the attacked output is given by

$$\begin{aligned} \inf _{\mathbb {E} \left[ \left( \widehat{\mathbf {x}}_k - \mathbf {x}_{k} \right) ^{2} \right] \ge D} \mathrm {KL}_{\infty } \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) = \frac{1}{2 \pi } \int _{0}^{2 \pi } \frac{1}{2} \left\{ \frac{S_{\widehat{\mathbf {n}}} \left( \omega \right) }{S_{\mathbf {y}} \left( \omega \right) } - \ln \left[ 1 + \frac{S_{\widehat{\mathbf {n}}} \left( \omega \right) }{S_{\mathbf {y}} \left( \omega \right) } \right] \right\} \mathrm {d} \omega , \end{aligned}$$
(3.22)

where

$$\begin{aligned} S_{\widehat{\mathbf {n}}} \left( \omega \right) = \frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{ 1 - \zeta S_{\mathbf {y}} \left( \omega \right) }, \end{aligned}$$
(3.23)

and \(S_{\mathbf {y}} \left( \omega \right) \) is given by

$$\begin{aligned} S_{\mathbf {y}} \left( \omega \right) = \left| \frac{c }{\mathrm {e}^{\mathrm {j} \omega } - a + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) b c} \right| ^2 \sigma _{\mathbf {w}}^2 + \left| \frac{\mathrm {e}^{\mathrm {j} \omega } - a }{\mathrm {e}^{\mathrm {j} \omega } - a + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) b c} \right| ^2 \sigma _{\mathbf {v}}^2. \end{aligned}$$
(3.24)

Herein, \(\zeta \) is the unique constant that satisfies

$$\begin{aligned} \frac{1}{2\pi } \int _{-\pi }^{\pi } \frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{ 1 - \zeta S_{\mathbf {y}} \left( \omega \right) } \mathrm {d} \omega =c^2 D, \end{aligned}$$
(3.25)

while

$$\begin{aligned} 0< \zeta < \min _{\omega } \frac{1}{S_{\mathbf {y}} \left( \omega \right) }. \end{aligned}$$
(3.26)

Moreover, the worst-case attack \(\left\{ \mathbf {n}_{k} \right\} \) is a stationary colored Gaussian process with power spectrum

$$\begin{aligned} S_{\mathbf {n}} \left( \omega \right) = \left| \frac{\mathrm {e}^{\mathrm {j} \omega } - a + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) b c}{bc } \right| ^2 \frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{ 1 - \zeta S_{\mathbf {y}} \left( \omega \right) }. \end{aligned}$$
(3.27)
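For concreteness, the sketch below carries out the computation of Theorem 3.2 for an assumed unstable plant stabilized by a static output feedback gain \(K \left( z \right) = k_0\) (so that the closed-loop pole is \(a - k_0 b c\)); all numerical values are illustrative:

```python
# A minimal sketch of the worst-case actuator attack in Theorem 3.2,
# under an assumed static stabilizing controller K(z) = k0.
import numpy as np

a, b, c = 1.2, 1.0, 1.0            # unstable plant (illustrative)
k0 = 0.9                           # |a - k0*b*c| = 0.3 < 1: stabilizing
sigma_w2, sigma_v2, D = 1.0, 1.0, 0.5

omega = np.linspace(-np.pi, np.pi, 40001)
z = np.exp(1j * omega)
den = z - a + k0 * b * c           # e^{j omega} - a + K(e^{j omega}) b c
S_y = (np.abs(c / den) ** 2 * sigma_w2
       + np.abs((z - a) / den) ** 2 * sigma_v2)        # (3.24)

def distortion(zeta):
    """Left-hand side of (3.25)."""
    return np.trapz(zeta * S_y**2 / (1.0 - zeta * S_y), omega) / (2 * np.pi)

lo, hi = 0.0, 1.0 / S_y.max()      # interval (3.26)
for _ in range(200):               # bisection on the monotone map
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if distortion(mid) < c**2 * D else (lo, mid)
zeta = 0.5 * (lo + hi)

S_nhat = zeta * S_y**2 / (1.0 - zeta * S_y)            # (3.23)
S_n = np.abs(den / (b * c)) ** 2 * S_nhat              # (3.27)
print(zeta, np.trapz(S_n, omega) / (2 * np.pi))        # zeta, attack power
```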
Fig. 3.6 A feedback control system under actuator attack: equivalent system

Proof

Note first that when the closed-loop system is stable, the power spectrum of \(\left\{ \mathbf {y}_{k} \right\} \) is given by

$$\begin{aligned} S_{\mathbf {y}} \left( \omega \right)&= \frac{1}{b^2} \left| \frac{ P \left( \mathrm {e}^{\mathrm {j} \omega } \right) }{1 + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) P \left( \mathrm {e}^{\mathrm {j} \omega } \right) } \right| ^2 \sigma _{\mathbf {w}}^2 + \left| \frac{1}{1 + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) P \left( \mathrm {e}^{\mathrm {j} \omega } \right) } \right| ^2 \sigma _{\mathbf {v}}^2 \nonumber , \\&= \frac{1}{b^2} \left| \frac{\frac{b c}{ \mathrm {e}^{\mathrm {j} \omega } - a }}{1 + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) \frac{b c}{ \mathrm {e}^{\mathrm {j} \omega } - a }} \right| ^2 \sigma _{\mathbf {w}}^2 + \left| \frac{1}{1 + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) \frac{b c}{ \mathrm {e}^{\mathrm {j} \omega } - a }} \right| ^2 \sigma _{\mathbf {v}}^2 \nonumber , \\&= \left| \frac{c }{\mathrm {e}^{\mathrm {j} \omega } - a + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) b c} \right| ^2 \sigma _{\mathbf {w}}^2 + \left| \frac{\mathrm {e}^{\mathrm {j} \omega } - a }{\mathrm {e}^{\mathrm {j} \omega } - a + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) b c} \right| ^2 \sigma _{\mathbf {v}}^2. \nonumber \end{aligned}$$

Note then that since the systems are linear, the system in Fig. 3.5 is equivalent to that of Fig. 3.6, where

$$\begin{aligned} \widehat{\mathbf {y}}_{k} = \mathbf {y}_{k} + \widehat{\mathbf {n}}_{k} , \nonumber \end{aligned}$$

and \(\left\{ \widehat{\mathbf {n}}_{k} \right\} \) is the output of the closed-loop system composed of the controller \(K \left( z \right) \) and the plant

$$\begin{aligned} \left\{ \begin{array}{rcl} \widehat{\mathbf {x}}_{k+1} - \mathbf {x}_{k+1} &=& a \left( \widehat{\mathbf {x}}_{k} - \mathbf {x}_{k} \right) + b \left( \overline{\mathbf {u}}_{k} - \mathbf {u}_{k} \right) + b \mathbf {n}_{k},\\ \widehat{\mathbf {n}}_{k} &=& c \left( \widehat{\mathbf {x}}_{k} - \mathbf {x}_{k} \right) , \end{array} \right. \nonumber \end{aligned}$$

as depicted by the upper half of Fig. 3.6. Meanwhile, as in the case of Fig. 3.3, the system in Fig. 3.6 may also be viewed as a “virtual channel” modeled as

$$\begin{aligned} \widehat{\mathbf {y}}_{k} = \mathbf {y}_{k} + \widehat{\mathbf {n}}_{k} \nonumber \end{aligned}$$

with noise constraint

$$\begin{aligned} \mathbb {E} \left[ \widehat{\mathbf {n}}_k^2 \right] \ge c^2 D, \nonumber \end{aligned}$$

where \(\left\{ \mathbf {y}_k \right\} \) is the channel input, \(\left\{ \widehat{\mathbf {y}}_k \right\} \) is the channel output, and \(\left\{ \widehat{\mathbf {n}}_k \right\} \) is the channel noise that is independent of \(\left\{ \mathbf {y}_k \right\} \). Then, following procedures similar to those in the proof of Theorem 3.1, it can be derived that

$$\begin{aligned} \inf _{\mathbb {E} \left[ \left( \widehat{\mathbf {x}}_k - \mathbf {x}_{k} \right) ^{2} \right] \ge D} \mathrm {KL}_{\infty } \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) = \frac{1}{2 \pi } \int _{0}^{2 \pi } \frac{1}{2} \left\{ \frac{S_{\widehat{\mathbf {n}}} \left( \omega \right) }{S_{\mathbf {y}} \left( \omega \right) } - \ln \left[ 1 + \frac{S_{\widehat{\mathbf {n}}} \left( \omega \right) }{S_{\mathbf {y}} \left( \omega \right) } \right] \right\} \mathrm {d} \omega , \nonumber \end{aligned}$$

where

$$\begin{aligned} S_{\widehat{\mathbf {n}}} \left( \omega \right) = \frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{ 1 - \zeta S_{\mathbf {y}} \left( \omega \right) }, \nonumber \end{aligned}$$

and \(\zeta \) is the unique constant that satisfies

$$\begin{aligned} \frac{1}{2\pi } \int _{-\pi }^{\pi } S_{\widehat{\mathbf {n}}} \left( \omega \right) \mathrm {d} \omega =\frac{1}{2\pi } \int _{-\pi }^{\pi } \frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{ 1 - \zeta S_{\mathbf {y}} \left( \omega \right) } \mathrm {d} \omega =c^2 D, \nonumber \end{aligned}$$

while

$$\begin{aligned} 0< \zeta < \min _{\omega } \frac{1}{S_{\mathbf {y}} \left( \omega \right) }. \nonumber \end{aligned}$$

In addition, since

$$\begin{aligned} S_{\widehat{\mathbf {n}}} \left( \omega \right)&= \left| \frac{ P \left( \mathrm {e}^{\mathrm {j} \omega } \right) }{1 + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) P \left( \mathrm {e}^{\mathrm {j} \omega } \right) } \right| ^2 S_{\mathbf {n}} \left( \omega \right) = \left| \frac{\frac{b c}{ \mathrm {e}^{\mathrm {j} \omega } - a }}{1 + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) \frac{b c}{ \mathrm {e}^{\mathrm {j} \omega } - a }} \right| ^2 S_{\mathbf {n}} \left( \omega \right) , \nonumber \\&= \left| \frac{b c }{\mathrm {e}^{\mathrm {j} \omega } - a + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) b c} \right| ^2 S_{\mathbf {n}} \left( \omega \right) , \nonumber \end{aligned}$$

we have

$$\begin{aligned} S_{\mathbf {n}} \left( \omega \right) = \left| \frac{\mathrm {e}^{\mathrm {j} \omega } - a + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) b c}{b c } \right| ^2 S_{\widehat{\mathbf {n}}} \left( \omega \right) = \left| \frac{\mathrm {e}^{\mathrm {j} \omega } - a + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) b c}{b c } \right| ^2 \frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{ 1 - \zeta S_{\mathbf {y}} \left( \omega \right) }. \nonumber \end{aligned}$$

This concludes the proof. \(\blacksquare \)

It is worth mentioning that the \(S_{\mathbf {y}} \left( \omega \right) \) in Theorem 3.2 is given by (3.24), which differs significantly from the one given by (3.8) for Theorem 3.1, although the notation is the same. Accordingly, \(\zeta \), \(S_{\mathbf {n}} \left( \omega \right) \), and so on will all be different between the two cases despite the identical notation.

Note also that \(S_{\mathbf {n}} \left( \omega \right) \) can be rewritten as

$$\begin{aligned} S_{\mathbf {n}} \left( \omega \right) = \left| \frac{1 + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) P \left( \mathrm {e}^{\mathrm {j} \omega } \right) }{ P \left( \mathrm {e}^{\mathrm {j} \omega } \right) } \right| ^2 \frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{ 1 - \zeta S_{\mathbf {y}} \left( \omega \right) }, \end{aligned}$$
(3.28)

which indicates that the attacker only needs to know the power spectrum of the original system output \(\left\{ \mathbf {y}_k \right\} \) and the transfer function of the closed-loop system (from \(\left\{ \mathbf {n}_k \right\} \) to \(\left\{ \widehat{\mathbf {y}}_k \right\} \)), i.e.,

$$\begin{aligned} \frac{P \left( z \right) }{ 1 + K \left( z \right) P \left( z \right) }, \end{aligned}$$
(3.29)

in order to carry out this worst-case attack.
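To make this recipe concrete, the following is a minimal Python sketch (with illustrative helper names, using the toy loop parameters of Sect. 3.4): it samples \(S_{\mathbf {y}} \left( \omega \right) \) on a uniform frequency grid, bisects for the unique \(\zeta \) meeting the shaped-noise power constraint \(c^2 D\), and then forms \(S_{\mathbf {n}} \left( \omega \right) \) according to (3.28).

```python
import numpy as np

# A minimal sketch of constructing the worst-case attack spectrum (3.28).
# Toy loop from Sect. 3.4: a = 2, b = c = 1, K(z) = 2, for which
# S_y(w) = 1 + |e^{jw} - 2|^2; any positive, bounded spectrum would do.
a, b, c, K = 2.0, 1.0, 1.0, 2.0
D = 1.0                                        # target distortion bound (arbitrary here)
w = np.linspace(-np.pi, np.pi, 4000, endpoint=False)
e = np.exp(1j * w)
S_y = 1.0 + np.abs(e - a) ** 2

def shaped_noise_power(zeta):
    """(1/2pi) * int zeta*S_y^2 / (1 - zeta*S_y) dw, i.e., E[n_hat_k^2]."""
    return np.mean(zeta * S_y ** 2 / (1.0 - zeta * S_y))

# Bisect for the unique zeta in (0, 1/max S_y) with noise power c^2 * D;
# the power is strictly increasing in zeta and blows up at the upper end.
lo, hi = 0.0, (1.0 - 1e-12) / S_y.max()
for _ in range(200):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if shaped_noise_power(mid) < c ** 2 * D else (lo, mid)
zeta = 0.5 * (lo + hi)

# Undo the closed-loop shaping (3.29) to obtain the injected spectrum (3.28).
T = b * c / (e - a + K * b * c)                # P / (1 + K P) for this loop
S_n = (zeta * S_y ** 2 / (1.0 - zeta * S_y)) / np.abs(T) ** 2
```

For this particular toy loop the closed-loop transfer function (3.29) happens to be all-pass, so the final shaping step is trivial; for a general loop it reweights the spectrum frequency by frequency.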

Again, we may examine the dual problem as follows.

Corollary 3.2

Consider the feedback control system under injection attacks depicted in Fig. 3.5. Then, in order for the attacker to ensure that the KL divergence rate between the original output and the attacked output is upper bounded by a (positive) constant R as

$$\begin{aligned} \mathrm {KL}_{\infty } \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) \le R, \end{aligned}$$
(3.30)

the maximum state distortion \(\mathbb {E} \left[ \left( \widehat{\mathbf {x}}_k - \mathbf {x}_{k} \right) ^{2} \right] \) that can be achieved is given by

$$\begin{aligned} \sup _{\mathrm {KL}_{\infty } \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) \le R} \mathbb {E} \left[ \left( \widehat{\mathbf {x}}_k - \mathbf {x}_{k} \right) ^{2} \right] = \frac{1}{2\pi } \int _{-\pi }^{\pi } \frac{1}{c^2} \left[ \frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{ 1 - \zeta S_{\mathbf {y}} \left( \omega \right) } \right] \mathrm {d} \omega , \end{aligned}$$
(3.31)

where \(\zeta \) satisfies

$$\begin{aligned}&\frac{1}{2 \pi } \int _{0}^{2 \pi } \frac{1}{2} \left\{ \frac{\frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{1 - \zeta S_{\mathbf {y}} \left( \omega \right) }}{S_{\mathbf {y}} \left( \omega \right) } - \ln \left[ 1 + \frac{\frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{1 - \zeta S_{\mathbf {y}} \left( \omega \right) }}{S_{\mathbf {y}} \left( \omega \right) } \right] \right\} \mathrm {d} \omega \nonumber \\&~~~~ = \frac{1}{2 \pi } \int _{0}^{2 \pi } \frac{1}{2} \left\{ \frac{\zeta S_{\mathbf {y}} \left( \omega \right) }{1 - \zeta S_{\mathbf {y}} \left( \omega \right) } - \ln \left[ \frac{1}{1 - \zeta S_{\mathbf {y}} \left( \omega \right) } \right] \right\} \mathrm {d} \omega = R, \end{aligned}$$
(3.32)

while

$$\begin{aligned} 0< \zeta < \min _{\omega } \frac{1}{S_{\mathbf {y}} \left( \omega \right) }. \end{aligned}$$
(3.33)

Note that herein \(S_{\mathbf {y}} \left( \omega \right) \) is given by (3.24). Moreover, this maximum distortion is achieved when the attack signal \(\left\{ \mathbf {n}_{k} \right\} \) is chosen as a stationary colored Gaussian process with power spectrum

$$\begin{aligned} S_{\mathbf {n}} \left( \omega \right) = \left| \frac{\mathrm {e}^{\mathrm {j} \omega } - a + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) b c}{b c } \right| ^2 \frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{ 1 - \zeta S_{\mathbf {y}} \left( \omega \right) }. \end{aligned}$$
(3.34)
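Numerically, Corollary 3.2 suggests a simple two-step evaluation. The sketch below (a hypothetical helper, assuming \(S_{\mathbf {y}} \left( \omega \right) \) is sampled on a uniform grid over one period) bisects (3.32) for \(\zeta \) and then evaluates (3.31); bisection is well posed because the left-hand side of (3.32) is strictly increasing in \(\zeta \).

```python
import numpy as np

# A hedged sketch of evaluating Corollary 3.2: given a stealthiness budget
# R, bisect for the zeta solving (3.32), then evaluate the maximum
# distortion (3.31). S_y is the attack-free output spectrum (3.24).
def max_distortion(S_y, R, c=1.0, iters=200):
    def kl_rate(zeta):
        r = zeta * S_y / (1.0 - zeta * S_y)    # = S_n_hat(w) / S_y(w)
        return 0.5 * np.mean(r - np.log1p(r))  # left-hand side of (3.32)
    lo, hi = 0.0, (1.0 - 1e-12) / S_y.max()    # zeta range (3.33)
    for _ in range(iters):                     # kl_rate increases with zeta
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if kl_rate(mid) < R else (lo, mid)
    zeta = 0.5 * (lo + hi)
    return np.mean(zeta * S_y ** 2 / (1.0 - zeta * S_y)) / c ** 2
```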

3.4 Simulation

In this section, we use (toy) numerical examples to illustrate the fundamental stealthiness–distortion trade-offs in linear Gaussian open-loop dynamical systems as well as (closed-loop) feedback control systems.

Consider first open-loop dynamical systems as in Sect. 3.3.1. Let \(a=0.5, b = 1, c = 1\), \(\sigma _{\mathbf {w}}^2 = 1, \sigma _{\mathbf {v}}^2 = 1\), and \(S_{\mathbf {u}} \left( \omega \right) = 1\) therein for simplicity. Accordingly, we have

$$\begin{aligned} S_{\mathbf {y}} \left( \omega \right) = \frac{2}{\left| \mathrm {e}^{\mathrm {j} \omega } - 0.5 \right| ^2} + 1 = \frac{2}{\left( \cos \omega - 0.5 \right) ^2 + \sin ^2 \omega } + 1. \nonumber \end{aligned}$$

In such a case, the relation between the minimum KL divergence rate \(\mathrm {KL}_{\infty } \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) \) (denoted as KL in the figure) and the distortion bound D is illustrated in Fig. 3.7. It is clear that KL increases (strictly) with D, i.e., in order for the attacker to achieve larger distortion, the stealthiness level of the attack will inevitably decrease.

Fig. 3.7 The relation between \(\mathrm {KL}_{\infty } \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) \) (denoted as KL) and D in open-loop dynamical systems

Note that the relation between the maximum distortion \(\mathbb {E} \left[ \left( \widehat{\mathbf {x}}_k - \mathbf {x}_{k} \right) ^{2} \right] \) and the KL divergence rate bound R in Corollary 3.1 is essentially the same as that between the distortion bound D and the minimum KL divergence rate \(\mathrm {KL}_{\infty } \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) \) in Theorem 3.1.

Fig. 3.8 The relation between \(\mathrm {KL}_{\infty } \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) \) (denoted as KL) and D in feedback control systems

Consider then feedback control systems as in Sect. 3.3.2. Let \(a = 2, b = 1, c = 1\), \(\sigma _{\mathbf {w}}^2 = 1, \sigma _{\mathbf {v}}^2 = 1\), and \(K \left( z \right) = 2\) therein for simplicity. Accordingly, we have

$$\begin{aligned} S_{\mathbf {y}} \left( \omega \right) = 1 + \left| \mathrm {e}^{\mathrm {j} \omega } - 2 \right| ^2 = 1 + \left( \cos \omega - 2 \right) ^2 + \sin ^2 \omega . \nonumber \end{aligned}$$

In such a case, the relation between the minimum KL divergence rate \(\mathrm {KL}_{\infty } \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) \) (denoted as KL in the figure) and the distortion bound D is illustrated in Fig. 3.8. Again, KL increases (strictly) with D. Meanwhile, the relation between the maximum distortion \(\mathbb {E} \left[ \left( \widehat{\mathbf {x}}_k - \mathbf {x}_{k} \right) ^{2} \right] \) and the KL divergence rate bound R in Corollary 3.2 is essentially the same as that between the distortion bound D and the minimum KL divergence rate \(\mathrm {KL}_{\infty } \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) \) in Theorem 3.2.
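Curves like those in Figs. 3.7 and 3.8 can be traced without root-finding, since both the distortion and the KL divergence rate are (strictly) increasing in \(\zeta \): sweeping \(\zeta \) over \(\left( 0, \min _{\omega } 1 / S_{\mathbf {y}} \left( \omega \right) \right) \) parameterizes the entire trade-off curve. A minimal sketch, assuming the parameter values above:

```python
import numpy as np

# Trace the KL-versus-D trade-off curve by sweeping zeta; both quantities
# are increasing in zeta, so no root-finding is needed.
w = np.linspace(-np.pi, np.pi, 4000, endpoint=False)
e = np.exp(1j * w)

S_y_open = 2.0 / np.abs(e - 0.5) ** 2 + 1.0   # open-loop example (a = 0.5)
S_y_feed = 1.0 + np.abs(e - 2.0) ** 2         # feedback example (a = 2, K = 2)

def tradeoff_curve(S_y, c=1.0, n_pts=100):
    zetas = np.linspace(1e-4, 0.999, n_pts) / S_y.max()
    D, KL = [], []
    for zeta in zetas:
        r = zeta * S_y / (1.0 - zeta * S_y)   # = S_n_hat(w) / S_y(w)
        D.append(np.mean(zeta * S_y ** 2 / (1.0 - zeta * S_y)) / c ** 2)
        KL.append(0.5 * np.mean(r - np.log1p(r)))
    return np.array(D), np.array(KL)

D_open, KL_open = tradeoff_curve(S_y_open)
D_feed, KL_feed = tradeoff_curve(S_y_feed)
# Plotting KL against D (e.g., with matplotlib) reproduces the qualitative
# shape of Figs. 3.7 and 3.8: KL is zero at D = 0 and strictly increasing.
```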

3.5 Conclusion

In this chapter, we have presented the fundamental stealthiness–distortion trade-offs of linear Gaussian open-loop dynamical systems and (closed-loop) feedback control systems under data injection attacks, and we have obtained explicit formulas, in terms of power spectra, that analytically characterize these trade-offs as well as the properties of the worst-case attacks.

So why do we care about explicit formulas in the first place? One value of the explicit stealthiness–distortion trade-off formula for feedback control systems, for instance, is that it renders the subsequent controller design explicit (and intuitive) as well. More specifically, given a threshold on the output stealthiness, Corollary 3.2 already tells us the maximum state distortion that the attacker can achieve. A natural control design criterion is then to choose the controller \(K \left( z \right) \) so as to minimize this maximum distortion. Mathematically, this minimax problem can be formulated as follows:

$$\begin{aligned} \inf _{K \left( z \right) } \sup _{\mathrm {KL}_{\infty } \left( p_{\widehat{\mathbf {y}}} \Vert p_{\mathbf {y}} \right) \le R} \mathbb {E} \left[ \left( \widehat{\mathbf {x}}_k - \mathbf {x}_{k} \right) ^{2} \right] = \inf _{K \left( z \right) } \left\{ \frac{1}{2\pi } \int _{-\pi }^{\pi } \frac{1}{c^2} \left[ \frac{\zeta S_{\mathbf {y}}^2 \left( \omega \right) }{ 1 - \zeta S_{\mathbf {y}} \left( \omega \right) } \right] \mathrm {d} \omega \right\} , \nonumber \end{aligned}$$

where

$$\begin{aligned} S_{\mathbf {y}} \left( \omega \right)&= \left| \frac{c }{\mathrm {e}^{\mathrm {j} \omega } - a + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) b c} \right| ^2 \sigma _{\mathbf {w}}^2 + \left| \frac{\mathrm {e}^{\mathrm {j} \omega } - a }{\mathrm {e}^{\mathrm {j} \omega } - a + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) b c} \right| ^2 \sigma _{\mathbf {v}}^2 \nonumber , \\&= \left| \frac{ P \left( \mathrm {e}^{\mathrm {j} \omega } \right) }{1 + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) P \left( \mathrm {e}^{\mathrm {j} \omega } \right) } \right| ^2 \frac{\sigma _{\mathbf {w}}^2 }{b^2} + \left| \frac{ 1}{1 + K \left( \mathrm {e}^{\mathrm {j} \omega } \right) P \left( \mathrm {e}^{\mathrm {j} \omega } \right) } \right| ^2 \sigma _{\mathbf {v}}^2, \nonumber \end{aligned}$$

whereas the infimum is taken over all \(K \left( z \right) \) that stabilize the plant \(P \left( z \right) \). Herein, \(\zeta \) can be treated as a tuning parameter as long as it satisfies

$$\begin{aligned} 0< \zeta < \min _{\omega } \frac{1}{S_{\mathbf {y}} \left( \omega \right) }. \nonumber \end{aligned}$$
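Alternatively to treating \(\zeta \) as a free parameter, one may fix the stealthiness budget R and solve (3.32) for \(\zeta \) per candidate controller. Purely as an illustration of how the criterion might then be explored numerically (and not as a solution of the minimax problem), the sketch below restricts attention to static gains \(K \left( z \right) = k\), an assumption made here only for simplicity, grids over the stabilizing range, and evaluates the worst-case distortion at budget R using the max_distortion helper sketched after Corollary 3.2:

```python
import numpy as np

# A rough numerical exploration of the minimax criterion, restricted to
# static gains K(z) = k for the scalar plant P(z) = bc/(z - a); reuses
# max_distortion from the sketch following Corollary 3.2.
a, b, c, R = 2.0, 1.0, 1.0, 0.1               # toy values; R is an arbitrary budget
sigma_w2 = sigma_v2 = 1.0
w = np.linspace(-np.pi, np.pi, 4000, endpoint=False)
e = np.exp(1j * w)

results = []
for k in np.linspace(1.05, 2.95, 191):        # closed-loop pole a - k*b*c in (-1, 1)
    S_y = (c ** 2 * sigma_w2 + np.abs(e - a) ** 2 * sigma_v2) \
          / np.abs(e - a + k * b * c) ** 2
    results.append((max_distortion(S_y, R, c), k))
best_distortion, best_k = min(results)        # gain minimizing the worst case
```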

We will, however, leave more detailed investigations of this formulation to future research.

Other potential future research directions include the investigation of such trade-offs for state estimation systems. It might also be interesting to examine the security–privacy trade-offs (see, e.g., Farokhi and Esfahani (2018), Fang and Zhu (2020, 2021)).