1. INTRODUCTION

At different stages in the development of automatic control theory for linear time-invariant systems, developers imposed different requirements on the operation of control systems. In the initial period, the basic problem facing the developers of automatic control systems was to ensure the stability of the closed-loop system. As the goals of the operation of control systems became more sophisticated, more complex requirements were imposed on the systems, including ensuring prescribed characteristics of transient processes, optimizing some parameters and characteristics of the closed-loop control system, etc. All this has led to the rise of optimal control theory using traditional and new optimization methods.

One important characteristic of control systems is the energy that the system spends in the course of its operation. If the system is described by differential equations that include the control as a parameter, then the energy is the integral of a quadratic form whose arguments are the plant state and the control constructed for this plant. For a discrete-time representation of the control system, the energy is a sum that is the analog of this integral in the continuous case. The problem of minimizing the energy consumed by the control system is one of the most important problems in optimal control theory and has found application in the controller synthesis for many technical systems. Energy is an averaged characteristic of a control system. The mathematical model of such a performance criterion can be represented in the form of the \( H_2\)-norm of the closed-loop system. Among many other performance criteria, one can distinguish the criterion of rejecting the worst disturbance in a given set acting on the system. Guaranteed rejection of the worst disturbance is an extremely important problem in designing automatic control systems for plants that must maintain operating capacity under extreme conditions, such as aircraft, nuclear reactors, and systems operating in emergency situations. Control systems with such performance criteria belong to the class of minimax (game) systems. The mathematical model of this performance criterion is the \( H_{\infty }\)-norm of the closed-loop system.

Both of these criteria can be viewed as different ways to reject external disturbances for the linear problem under various assumptions on input signals acting on the system.

The analysis of these problems has revealed some common points in the solution. This has led the researchers to the idea that there must be a theory in some sense generalizing these two problems and that each of the problems will then be a special case of this theory. The present paper is a survey of various problems in automatic control theory which, to some extent, develop a theory generalizing the \(H_2\)- and \(H_{\infty } \)-controller design methods for linear systems.

The first part of this survey deals with various theories constructed in the second half of the last century, which, to some extent, develop the classical statements in the \(H_2 \) and \(H_{\infty } \) automatic control theories.

The second part mainly deals with the robust stochastic control theory with anisotropy-based cost functional created by I.G. Vladimirov, who solved the problem of constructing a control theory lying in between the \(H_2 \)- and \(H_{\infty } \)-theories. Moreover, both of these theories are special (extreme) cases of the anisotropy-based control theory.

At the end of the survey, we consider minimax \(LQG \) control problems, where the ideas are close to those used when constructing an anisotropy-based control. The statement of minimax \(LQG \) control problems uses an information-theoretic characteristic—relative entropy—of two random signals as well. However, in contrast to the anisotropy-based theory, where the concept of relative entropy is employed to form a performance criterion for the control system, this signal characteristic is used in the minimax \(LQG \) control to describe the constraints in the system.

Because of the similarity of control and filtering problems [147, 149, 150], the authors decided not to include the papers on filtering with \(H_2 \), \(H_{\infty }\), and anisotropy-based criteria in this survey, because this would have considerably increased the length of the paper without offering any essentially novel ideas. The statements and solutions of anisotropy-based filtering problems can be found in [62, 63]. The ideas of \(H_{\infty } \)-filtering are described, for example, in [9, 10]. The problems of mixed \(H_2/H_{\infty } \)-filtering, implying the \(H_2 \)-norm as a minimization criterion and providing a given level of rejection of disturbances, were solved in [247].

The anisotropy-based control theory relies on information-theoretic ideas for describing the uncertainty of signals in control systems based on the concept of entropy. Recently, much attention has been paid in various papers to the information-theoretic description of uncertainty. In particular, the paper [46] deals with minimizing and maximizing relative entropy in various disciplines, but it says nothing about describing the uncertainties of signals in control systems in terms of relative entropy. The present survey aims to bridge this gap.

In a vivid expression from the introduction to the book [108], information theory answers two fundamental questions, one about the ultimate data compression (the answer is entropy) and the other about the ultimate data transmission rate (the answer is the channel capacity). Likewise, control theory provides two basic requirements for control systems: constructing stable systems and guaranteeing the prescribed performance. Both information theory and control theory deal with models of signals. In control theory, signals are mainly viewed as the inputs and outputs of the control systems (plant plus controller) that the theory aims to create. In information theory, signals and their characteristics are the main object of study.

Information theory mainly studies signals of a stochastic nature. The introduction of a probabilistic description into control theory has brought the models of control systems closer to real technical systems. Probabilistic terms allow one to describe both the external signals acting on a plant and the internal signals. Accordingly, the uncertainties in the system must be described in a similar way.

The description of control systems using probabilistic characteristics already implies some uncertainty. According to A.A. Krasovskii [33], “statistical consideration allows one to build a kind of bridge from the dynamics of systems to an information description of processes in control systems. This bridge is that transient processes are described in information terms.” However, one can insert uncertainties in the description of the characteristics of a random process occurring in the control system. It is well known that the most complete characteristic of a random process is the probability distribution density. The unknown characteristics of the probability distribution density of a random input signal or random initial conditions are uncertainties in the probabilistic description of the system, while various models for describing unknown density characteristics allow one to state and solve meaningful problems of control theory in the presence of uncertainty [3]. For example, the study of parameter estimation and filtering problems for stochastic processes and stochastic control in G.P. Tartakovsky’s book [60] includes the case where the problems involve an a priori uncertainty.

Ideas for applying information theory in control began to appear in the 1960s. In his book [66], A.A. Feldbaum noted the promising prospects of applying information-theoretic methods in control theory. In the introduction to his book [57], A.V. Solodov pointed out the importance of using information characteristics of signals when stating control problems. The book describes the application of information characteristics to the assessment of automatic control systems and considers the throughput of systems in the presence of disturbances. However, no attempts were made at that time to introduce information characteristics into the control model description or performance criteria.

The relationship between information theory and control theory is twofold. The first direction is well described in the survey [2]: objects studied in information theory (for example, a communication channel) are introduced into the description of a plant, which necessitates studying traditional problems of control theory with the modified models taken into account. The second direction of how information theory influences control theory is that the well-developed apparatus of information theory is used to describe the set of signals circulating in the system, including not only the input and output signals but also internal ones. Taking into account the probability-theoretic characteristics of the input signals and the signals inside the plant is the basis of anisotropy-based control theory, and the second part of the present paper is a survey of papers on this theory.

In the present survey, the papers are mainly considered in chronological order.

We use the following notation: \(\mathbb {Z}\) is the ring of integers, \( {\mathbb R}\) is the field of real numbers, \(\mathbb {C} \) is the field of complex numbers, \(\mathbf {E}\thinspace \) is the expectation, \(\mathbf {cov} \) is the covariance matrix, \(\otimes \) is the Kronecker product, \(\mathrm {diag}(i_1,\ldots , i_N)\) is the \(N\times N \) diagonal matrix with entries \(i_1,\ldots , i_N \) on the main diagonal, \(|\cdot | \) is the Euclidean norm of a vector, \(\mathbb {L}_2^m \) is the class of \(m \)-dimensional square summable random sequences, \(l_2 \) is the class of square summable random sequences, \(\mathfrak {L}_2^m\) is the class of \({\mathbb R}^m \)-valued absolutely continuously distributed random vectors with finite second moment, \(D(p\parallel q)\) is the relative entropy of, or the Kullback–Leibler divergence [34] between, two distributions \(p \) and \(q \), \(\mathbf {A}(w)\) is the anisotropy of a random vector \(w\), \(\overline {\mathbf {A}}(W) \) is the mean anisotropy of a random sequence \(W \), and \({| \! | \! |} F{| \! | \! |}_a \) is the anisotropic norm of a system \(F \) with mean anisotropy level \(a \) of the input sequence.

By \(H_2^{m\times m}\) we denote the Hardy space of analytic matrix functions \(G \) in the open unit disk \(\left \{ z \in \mathbb {C}^1:\ |z| < 1\right \}\) on the complex plane with finite \(H_2 \)-norm

$$ \|G\|_2= \left (\frac {1}{2\pi } \int \limits _{\Omega } \mathrm {tr} \Big ( \big ({\widehat G}(\omega )\big )^* {\widehat G}(\omega ) \Big ) d\omega \right )^{1/2} ,$$

where \( {\widehat G}(\omega ) = \lim _{r \to 1-0}\thinspace G\left (r\thinspace \mathrm{e}^{i\omega }\right ) \), \(\omega \in \Omega =[-\pi ; \pi ] \), is the angular boundary value of the function

$$ G(z) = \sum _{k=0}^{+\infty } g_k\thinspace z^k , $$
(1.1)

\(g_k\) is the impulse response (pulse transient characteristic) of the system, and \((\cdot )^*\) stands for Hermitian conjugation (complex conjugate transposition).
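For a stable discrete-time system given in state space, the \(H_2\)-norm just defined can be computed without frequency integration, via a discrete Lyapunov equation for the controllability Gramian. Below is a minimal numerical sketch in Python with SciPy; the scalar example system is our own illustrative choice, not one from the survey.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def h2_norm(A, B, C, D):
    """H2 norm of a stable discrete-time system (A, B, C, D).

    ||G||_2^2 = tr(C P C^T + D D^T), where P is the controllability
    Gramian solving the discrete Lyapunov equation A P A^T - P + B B^T = 0.
    This equals the sum over k of the squared Frobenius norms of the
    impulse response matrices g_k.
    """
    P = solve_discrete_lyapunov(A, B @ B.T)
    return np.sqrt(np.trace(C @ P @ C.T + D @ D.T))

# Scalar example: A = 0.5 gives impulse response g_k = 0.5^(k-1) for k >= 1,
# so ||G||_2^2 = sum_j 0.25^j = 4/3.
A = np.array([[0.5]]); B = np.array([[1.0]])
C = np.array([[1.0]]); D = np.array([[0.0]])
print(h2_norm(A, B, C, D) ** 2)  # ≈ 1.3333
```

The Gramian route is how \(H_2\)-norms are typically evaluated in software, since it avoids numerical quadrature over \(\omega \).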

By \( H_{\infty }^{m\times m}\) (\(RH_{\infty }^{m\times m} \)) we denote the Hardy space of (real rational) transfer functions \(H(z) \) of a discrete-time system, analytic in the open unit disk, with the norm

$$ \|H\|_{\infty } = \sup _{|z| < 1}\thinspace \overline {\sigma }(H(z)),$$
(1.2)

where \(\overline {\sigma }(\cdot ) \) is the maximum singular value of a matrix.
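The norm (1.2) can be approximated numerically by evaluating \(\overline {\sigma }\) of the transfer matrix on a grid over the unit circle. The sketch below (in Python) uses plain gridding rather than the bisection algorithms employed in production solvers, and the example system is our own illustrative choice.

```python
import numpy as np

def hinf_norm_grid(A, B, C, D, n_grid=2000):
    """Approximate H-infinity norm of a stable discrete-time system by
    gridding the unit circle: max over omega of the largest singular
    value of H = C (e^{i omega} I - A)^{-1} B + D.  (A crude gridding
    sketch; production codes use bisection on Hamiltonian pencils.)"""
    n = A.shape[0]
    worst = 0.0
    for w in np.linspace(0.0, np.pi, n_grid):
        z = np.exp(1j * w)
        H = C @ np.linalg.solve(z * np.eye(n) - A, B) + D
        worst = max(worst, np.linalg.svd(H, compute_uv=False)[0])
    return worst

# Scalar example: H = 1/(z - 0.5) peaks at omega = 0 with value 1/0.5 = 2.
A = np.array([[0.5]]); B = np.array([[1.0]])
C = np.array([[1.0]]); D = np.array([[0.0]])
print(hinf_norm_grid(A, B, C, D))  # ≈ 2.0 (peak at omega = 0)
```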

2. OVERVIEW OF MODERN APPROACHES TO EXTERNAL DISTURBANCES REJECTION IN LINEAR SYSTEMS

In the 1950s, there appeared studies on how to apply probabilistic methods to filtering theory and later to automatic control theory. These studies were based on the fundamental paper [32] on the theory of linear discrete-time filtering published by Academician A.N. Kolmogorov in 1941 and on a similar theory independently developed by the great American mathematician N. Wiener, who considered the problems of linear filtering of signals as well as of their extrapolation and interpolation in continuous time [233]. For more details, see [19].

Previously, as a rule, the signals occurring in the control system had been assumed to be deterministic. The Wiener–Kolmogorov theory relied on the spectral theory of random processes originating from A.Ya. Khinchin’s fundamental paper [68], where he established that the correlation function of a random process and its power spectral density are related by the Fourier transform. The theory set forth in Wiener’s book was very difficult for engineers, who did not have the relevant mathematical training in those years, to comprehend. In 1950, the famous American scientists H.W. Bode and C.E. Shannon used intuitive considerations to give a simplified presentation of this theory [98]. A completely new approach to the extraction of a signal from noise, quite different from the Kolmogorov–Wiener theory and also applicable to optimal linear filtering of signals, was proposed by R.L. Stratonovich [58] in 1959. His theory was based on the representation of the random processes modeling both the useful signal and the noise by differential equations (state equations). Independently of Stratonovich, definitive results on discrete- and continuous-time optimal linear filtering were obtained in 1961 by the American scientists R.E. Kalman and R.S. Bucy [149, 150]. For Gaussian and Markov random processes, Stratonovich, Kalman, and Bucy derived differential equations determining the structure of an optimal filter whose input receives the signal, together with a matrix Riccati equation determining the estimation accuracy. The estimation equations are differential rather than integral ones, which is a practical advantage, because differential equations are much easier to solve than integral equations using analog or digital hardware.

The introduction of signals with probabilistic characteristics into control theory made it possible to state and solve a new class of control theory problems. One striking result of the time was the control theory for linear systems with a quadratic performance criterion. This theory provided a powerful tool for the synthesis of multidimensional control systems. The \(LQG \) (linear-quadratic Gaussian) problem (Kalman [28, 148]) is a control design problem for a plant with linear dynamics driven by additive Gaussian noise and with a performance criterion that is the expectation of a positive semidefinite quadratic form. It has an interesting feature: the controller solving this problem turns out to be a linear function of the state and is identical to the controller in the problem in which there is no Gaussian noise. The latter is called the \(LQR \) controller design problem (where the abbreviation \(LQR \) stands for “linear-quadratic regulator”); it was solved by A.M. Letov [41,42,43,44].

Recall the statement and solution of the \(LQG \) optimization problem for the linear continuous-time dynamical system

$$ \dot {x}(t) = A(t)\thinspace x(t)+B(t)u(t)+v(t), $$
(2.1)
$$ y(t) = C(t)x(t)+w(t),$$
(2.2)

where \(x \) is the system state vector, \(u \) is the control vector, and \(y \) is the measured output used to construct the control. The system is also subjected to additive Gaussian white noises \(v(t) \) and \(w(t) \). For a given system, one must find a control \( u(t)\) such that for each time \(t \) it linearly depends only on the previous values \(y(t^{\prime }) \), \(0\leqslant t^{\prime }<t \), and minimizes the performance criterion

$$ \begin {gathered} J={\mathbf {E}\thinspace }\left [x^\mathrm {T}(T)Fx(T)+ \int \limits _0^T\left (x^\mathrm {T}(t)Q(t)x(t) +u^\mathrm {T}(t)R(t)u(t)\right )dt\right ], \\ F\geqslant 0,\quad Q(t)\geqslant 0,\quad R(t)>0. \end {gathered} $$

The time horizon \(T \) may be finite or infinite. If \(T\rightarrow \infty \), then the first term \(x^\mathrm {T}(T)Fx(T) \) is dropped. For the cost functional \(J \) not to tend to infinity in this case, one considers the time-averaged functional \(\frac {J}{T}\) instead.

The \(LQG\) controller solving this problem satisfies the equations

$$ \dot {\hat {x}}(t) = A(t)\hat {x}(t)+B(t)u(t)+L(t)(y(t)-C\hat {x}(t)),\quad \hat {x}(0)={\mathbf {E}\thinspace }[x(0)],$$
(2.3)
$$u(t) = -K(t)\hat {x}(t).$$
(2.4)

The matrix \(L(t) \) is called the Kalman gain matrix and is associated with the Kalman filter represented by (2.3). At each time, the filter generates an estimate \(\hat {x}(t)\) of the state \(x(t) \) using measurements and inputs. The gain matrix \(L(t) \) is determined by the matrices \(A(t) \), \(C(t)\), \(V(t) \), and \(W(t) \), the last two being the covariance matrices of \(v(t) \) and \(w(t) \), respectively, as well as by \({\mathbf {E}\thinspace }(x(0)x^\mathrm {T}(0))\). The Kalman gain matrix can be determined from the Riccati matrix differential equation

$$ \begin {gathered} \dot {P}(t)=A(t)P(t)+P(t)A^\mathrm {T}(t)-P(t)C^\mathrm {T}(t)W^{-1}(t)C(t)P(t)+V(t),\\[.2em] P(0) = {\mathbf {E}\thinspace }(x(0)x^\mathrm {T}(0)). \end {gathered}$$

Given the solution \(P(t) \), \(0\leqslant t\leqslant T \), the Kalman gain matrix is

$$ L(t)=P(t)C^\mathrm {T}(t)W^{-1}(t).$$

The feedback matrix \(K(t)\) is determined with the use of the matrices \(A(t)\), \(B(t) \), \(Q(t)\), \(R(t) \), and \(F \) and can be found from the Riccati matrix differential equation

$$ \begin {gathered} -\dot {S}(t)=A^\mathrm {T}(t)S(t)+S(t)A(t)-S(t)B(t)R^{-1}(t)B^\mathrm {T}(t)S(t)+Q(t),\\[.2em] S(T) = F. \end {gathered}$$

Once \(S(t) \), \(0\leqslant t\leqslant T \), has been found, the matrix \(K(t) \) can be computed as \(R^{-1}(t)B^\mathrm {T}(t)S(t) \).

The resulting Riccati matrix equations are very similar except that the first one is solved forward in time and the second, backward in time. The first equation allows one to solve the linear-quadratic estimation (\(LQE\)) problem, and the second, the problem of finding a linear-quadratic regulator (the \(LQR \) problem). Together, these two problems constitute the linear-quadratic Gaussian (\(LQG\)) control problem. The \(LQE \) and \(LQR \) problems can be solved separately. This fact is known as the separation principle.

If the matrices \(A(t)\), \(B(t) \), \(C(t) \), \(Q(t) \), \(R(t) \), \(V(t) \), and \(W(t) \) are time-invariant, then the control law becomes time-invariant as \( T\rightarrow \infty \), and the dynamic Riccati equations can be replaced with algebraic ones [54, 55].
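In the time-invariant case, the two algebraic Riccati equations can be solved numerically and yield constant gains \(K\) and \(L\). A minimal sketch in Python using SciPy's algebraic Riccati solver; the double-integrator plant and the weight/covariance matrices below are arbitrary illustrative choices, not taken from the survey.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Steady-state LQG for a double integrator (illustrative matrices).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Q = np.eye(2); R = np.array([[1.0]])   # state / control weights
V = np.eye(2); W = np.array([[1.0]])   # process / measurement noise covariances

# LQR part: A^T S + S A - S B R^{-1} B^T S + Q = 0, then K = R^{-1} B^T S.
S = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ S)

# Kalman part (the dual ARE): A P + P A^T - P C^T W^{-1} C P + V = 0,
# then L = P C^T W^{-1}.
P = solve_continuous_are(A.T, C.T, V, W)
L = P @ C.T @ np.linalg.inv(W)

# Separation principle: both closed-loop matrices are Hurwitz.
print(np.linalg.eigvals(A - B @ K).real.max() < 0)  # True
print(np.linalg.eigvals(A - L @ C).real.max() < 0)  # True
```

Note how the filter Riccati equation is solved by passing the transposed pair \((A^\mathrm {T}, C^\mathrm {T})\) to the same solver, reflecting the duality of the \(LQE\) and \(LQR\) problems.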

2.1. \( LQG\), \(LEQG \), and Risk-Sensitive Controllers

Let us give the statement and solution scheme of the \(LQG \) problem for the linear time-varying control system described by the equations

$$ x_{k+1} = A_k x_k + B_k u_k + w_k, $$
(2.5)
$$y_k = C_k x_k + v_k ,\quad 0\leqslant k < N, $$
(2.6)

with the zero initial condition \(x_0 = 0 \). Here \(k \) is the time index, and \(w_k \) and \(v_k \) are independent discrete-time Gaussian random processes with covariance matrices \(W_k\) and \(V_k \), respectively.

The cost functional is given by the expression

$$ J= \mathbf {E}\left (x^\mathrm {T}_N F x_N + \sum _{k=0}^{N-1}\left (x_k^\mathrm {T} Q_k x_k + u_k^\mathrm {T} R_k u_k\right )\right ),$$
(2.7)

\(F\geqslant 0 \), \(Q_k\geqslant 0 \), \(R_k >0 \).

The problem is to find a controller stabilizing the closed-loop system and minimizing the cost functional (2.7).

The controller in the \(LQG\) problem is given by the formula

$$ u_k=-L_k \hat {x}_k,$$
(2.8)

where \(\hat {x}_k\) is an estimate of the state \(x_k \) calculated using the solution of a difference Riccati equation. The feedback matrix \(L_k \) is determined using the solution of another difference Riccati equation. Thus, to obtain a control (2.8) minimizing the cost functional (2.7), one needs to solve two difference Riccati equations. If all the matrices in the problem statement are time-invariant, then the discrete \(LQG \) controller becomes time-invariant as the horizon \(N \) tends to infinity. In this case, the difference Riccati equations are replaced by the corresponding algebraic equations, with the equations determining \(\hat {x}_k\) and \(L_k \) solvable independently, i.e., separately. In the \(LQG \) control theory, the separation principle, more formally known as the principle of separation of estimation and control, asserts that the problem of designing an optimal \(LQG \) feedback controller for a stochastic system can be solved by developing an optimal state observer whose output is fed into an optimal deterministic controller. Hence the problem splits into two separate parts, which facilitates the synthesis. This separation principle, important in the discrete-time \(LQG \) problem, is described in detail, for example, in [48, 67].
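For the time-invariant discrete-time case, both algebraic equations are discrete Riccati equations, and the two gains can indeed be computed separately. A sketch in Python with SciPy; the plant, weights, and noise covariances below are illustrative choices, not taken from the survey.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Discrete-time steady-state LQG sketch (illustrative matrices).
A = np.array([[1.0, 0.1], [0.0, 1.0]])   # discretized double integrator
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])
Q = np.eye(2); R = np.array([[1.0]])          # cost weights as in (2.7)
W = 0.01 * np.eye(2); V = np.array([[0.1]])   # noise covariances as in (2.5), (2.6)

# Control DARE: A^T S A - S - A^T S B (R + B^T S B)^{-1} B^T S A + Q = 0,
# feedback gain L = (R + B^T S B)^{-1} B^T S A, control u = -L x_hat.
S = solve_discrete_are(A, B, Q, R)
Lqr = np.linalg.solve(R + B.T @ S @ B, B.T @ S @ A)

# Filter DARE (dual problem), predictor-form Kalman gain.
P = solve_discrete_are(A.T, C.T, W, V)
Lf = A @ P @ C.T @ np.linalg.inv(V + C @ P @ C.T)

# Separation: both closed-loop matrices are Schur (spectral radius < 1).
print(max(abs(np.linalg.eigvals(A - B @ Lqr))) < 1)  # True
print(max(abs(np.linalg.eigvals(A - Lf @ C))) < 1)   # True
```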

In the time-invariant case, the performance criterion (2.7) used in the statement of the problem of seeking an optimal \(LQG \) controller coincides with the \(H_2 \)-norm of the transfer function of the closed-loop system. Thus, the solution of the optimal \(LQG\) problem presumes minimization of the \(H_2\)-norm of the closed-loop system. In this situation, the \(H_2\)-optimal control problem is solved.

Note that the above-described problem has uncertainties neither in the description of the plant model nor in the description of input disturbances, because the Gaussian input sequence is completely determined by its probability density function.

A huge amount of literature all over the world deals with the \(LQG \) and \(LQR \) problems. For an extensive bibliography, see [80]. In Russia, the most popular was the monograph by H. Kwakernaak and R. Sivan [29], which sets out these problems for both continuous- and discrete-time control systems. The description of computational procedures can be found in [83, 220].

The first thematic issue of IEEE Transactions on Automatic Control was published in June 1971. This issue was completely devoted to various aspects of \(LQG \) control and, according to the journal editor, summarized the theoretical, algorithmic, and possible practical aspects of this problem [84]. However, even this issue already contained the papers [84, 194], which questioned the universality of this theory. According to the figurative expression of D.S. Bernstein [96] “the \(LQG\) theory and its technique of Riccati equations is very similar to a building consisting of only steel frames—very rigid and very limited.”

In practical \(LQG\) applications, the controller worked well enough if the additive noise was Gaussian white noise. However, if the input disturbance was strongly correlated in time, i.e., the noise was not white, then the \(LQG \) controllers did not meet the requirements imposed on the control systems closed by these controllers. It was intuitively clear that in the case of strongly correlated input disturbances, other controllers should work better than the \(LQG \) ones. This led control system developers to the idea of taking into account the properties of the random signals circulating in the control system when designing the controllers. One approach to accounting for the difference between the input random signal and white noise was based on changing the performance (optimality) criterion for the control system.

This approach was first applied by Jacobson in 1973 in the paper [144], where he proposed using an exponential-quadratic performance functional. The problem with a performance criterion including an exponential was called the \(LEQG\) (linear, exponential-quadratic, Gaussian) problem.

Jacobson [144] considered a linear time-invariant system with the measured state vector

$$ \begin {aligned} x_{k+1} &= A x_{k} + B_1 w_k + B_2 u_{k}, \\ y_k &= x_{k}, \end {aligned}$$

where \(x_k \) is the state vector, \(y_k \) is the measured output, \(u_k \) is the control vector, \(w_k \) is the Gaussian noise, and the constant matrices \(A \), \(B_1 \), and \(B_2 \) have appropriate sizes. He introduced the quadratic form

$$ G = x_N^\mathrm {T}\Pi x_N + \sum _{k=0}^{N-1}(x_k^\mathrm {T} Q_k x_k +u_k^\mathrm {T} R_k u_k). $$
(2.9)

The problem was to design a controller minimizing the functional

$$ \Upsilon _N= \sigma \mathbf {E}\left [\exp \left \{\sigma \frac {1}{2} G\right \}\right ], $$

where the parameter \(\sigma \) takes the values \(\pm 1 \). The value \(-1 \) corresponded to the so-called \(LE-G \) problem, and the value \(+1 \), to the \(LE+G \) setting. As the noise intensity goes to infinity, the optimal gains for the \(LE-G\) problem tend to zero; i.e., with such an input action, it is virtually impossible to reduce the value of the performance criterion by supplying a control signal. In the \(LE+G \) problem, the optimal controller ceases to exist if the noise intensity is sufficiently large (i.e., the performance criterion tends to infinity independently of the control input). The controller is given by the linear state function

$$ u_k = K(\Sigma _k)x_k $$

with the coefficient \(K(\Sigma _k) \) depending on \(\Sigma _k \), the covariance matrix of \(w_k \). Moreover, Jacobson was the first to show by direct calculations that the structure of the corresponding controller is the same as the structure of the controller for the control problem in a dynamic game. This interesting result for the first time established a connection between control problems in dynamic deterministic games and control problems based on minimizing stochastic performance functions.

With no noise present, the solution of this problem coincides with that of the \(LQR \) problem. However, in the presence of noise, the optimal controllers in Jacobson’s problem with an exponential performance criterion differ from the controllers in the \(LQG\) problem. Although, just as in the \(LQG\) problem, these controllers are linear functions of the state variables, they necessarily depend on the covariance matrix of the additive Gaussian noise. The solutions of these problems are close for small covariances but differ noticeably for large ones.

We point out that Jacobson’s problem has no uncertainties in the description of the plant and the input disturbances.

The general case (the case of observing an incomplete state vector) had remained unsolved until P. Whittle [229] considered a plant model of the form

$$ \begin {aligned} x_{k+1} &= A x_{k} + B u_k + \varepsilon _k, \\ y_{k} &= C x_{k} + \eta _k, \end {aligned}$$

where \(\varepsilon _k \) and \(\eta _k \) are the input disturbance and the measurement noise, respectively. It was assumed that the sequence of vectors \(\{[\varepsilon _k^\mathrm {T}, \eta _k^\mathrm {T}]^\mathrm {T}\}\) is a Gaussian white noise with the joint covariance matrix

$$ \mathbf {cov} [\varepsilon _k^\mathrm {T}, \eta _k^\mathrm {T}] = \left [ \begin {array}{ll} N & 0 \\ 0 & M \end {array}\right ].$$

The following quadratic form was introduced:

$$ G = x_T^\mathrm {T}\Pi x_T + \sum _{k=0}^{T-1}(x_k^\mathrm {T} Q x_k +u_k^\mathrm {T} R u_k). $$

The cost functional was chosen in the form

$$ \gamma (\theta ) = -2\theta ^{-1}\log \mathbf {E}\left (e^{-1/2 \theta G}\right ).$$
(2.10)

The matrices \(N \), \(M \), \(R \), \(Q \), and \(\Pi \) are positive definite, and \(\theta \) is a real scalar.

If the quantity \(\theta \thinspace \mathrm {Var}(G) \) is small (here \(\mathrm {Var}(G) \) is the variance of \(G \)), then \(\gamma (\theta ) \sim \mathbf {E}(G) - \frac {\theta }{4}\thinspace \mathrm {Var}(G)\). This illustrates the fact that the cases of \(\theta =0\), \(\theta >0 \), and \(\theta <0 \) correspond to risk-neutral, risk-seeking, and risk-averse behavior, respectively, in optimization problems [97]. Whittle called the parameter \(\theta \) the risk sensitivity parameter and showed that by choosing this parameter too large, one can arrive at a situation where the performance criterion assumes infinite values. It follows from the above reasoning that \(\gamma (0) \) is the traditional criterion and \({\mathbf {E}\thinspace }(G) \) is the limit of \(\gamma (\theta ) \) as \(\theta \) tends to \(0 \) from either side.
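The small-\(\theta \) expansion of (2.10) can be illustrated on a toy scalar example where \(\gamma (\theta )\) has a closed form. Taking \(G = g\thinspace x^2\) with \(x \sim N(0,1)\) (our own illustrative choice, not from [229]) gives \(\mathbf {E}\thinspace \mathrm {e}^{-\theta G/2} = (1+\theta g)^{-1/2}\) for \(\theta g > -1\), hence \(\gamma (\theta ) = \theta ^{-1}\log (1+\theta g)\) exactly, while \(\mathbf {E}(G) = g\) and \(\mathrm {Var}(G) = 2g^2\).

```python
import numpy as np

# Toy check of gamma(theta) ≈ E(G) - (theta/4) Var(G) for G = g*x^2,
# x ~ N(0, 1): exact value gamma = log(1 + theta*g)/theta versus the
# first-order expansion g - (theta/4)*(2*g^2).
g = 0.7
for theta in (0.1, 0.01, 0.001):
    exact = np.log(1.0 + theta * g) / theta
    approx = g - (theta / 4.0) * (2.0 * g ** 2)
    print(theta, exact, approx)  # the difference shrinks like theta^2
```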

Whittle showed that the optimal controller is a linear function of the state estimate obtained using the modified Kalman filter.

The problem of designing a control by minimizing an exponential-quadratic functional (the problem of designing a risk-sensitive control, or the risk-sensitive problem) was studied in various interpretations in [93, 103, 104, 145, 229, 230, 232].

However, the \(LQG\) and \(LEQG \) theories were very limited in their applications to the synthesis of real technical control systems. The cause of the limited capabilities of the \(LQG \) theory was indicated in [114] in the late 1970s: the controllers designed within the framework of this theory do not work well in the presence of uncertainties in the model description that are unaccounted for at the synthesis stage, even if these uncertainties are small. In modern terms, the control systems closed by \(LQG \) controllers are not robust with respect to some uncertainties in the description of the plant. We point out that, along with the concept of uncertainty in the description of the plant model or input signals, specialists in control theory must also specify in some definite way the class of uncertainties under consideration. By an uncertainty in the description of the plant we mean a parametric uncertainty in the coefficients of the mathematical model, an unstructured (or structured) uncertainty in the description of the plant model, as well as an uncertainty in the form of the so-called \( M-\Delta \) configuration in accordance with the modern classification of uncertainties. More details about the description of uncertainties can be found in [6, 52, 53, 174, 191]. The uncertainty in the description of signals in the control system will be defined in information-theoretic terms later on in this paper.

2.2. \( H_{\infty }\)-Optimal and Suboptimal Controllers

Attempts to overcome the drawbacks in the controller design theory for linear systems with a quadratic performance criterion [115] have led to the revival of the frequency domain approach in the form of the \(H_{\infty } \)-optimization theory [250].

In his pioneering article [250], Zames proposed to use another performance criterion for designing controllers, namely, the \(H_{\infty } \)-norm of the closed-loop system. Using this norm, in a sense, ensures the robust stability of the system. The idea of this method for ensuring robust stability is based on the well-known submultiplicativity (ring property) of the induced operator norm (see, e.g., [25]) and on the relationship between the stability of the control system and the condition that the operator of the system be contractive (its norm must be less than \(1 \)). If \(\|A\|_\mathrm {ind} \) is the induced norm of an operator \(A \), then

$$ \| A B\|_\mathrm{ind} \leqslant \| A\|_\mathrm{ind} \| B\|_{\rm ind}.$$
(2.11)

If we take the operator of the system for \(A \) and the uncertainty operator \(\Delta \) for \(B \), then the condition

$$ \| A\|_\mathrm{ind} \| \Delta \|_\mathrm{ind} < 1$$

ensures the stability of the series connection of the operators of the system and the uncertainty. The “measure” of uncertainty for which the combined system remains stable is determined by the “measure” of the system,

$$ \| \Delta \|_{\rm ind} < 1/\|A \|_\mathrm{ind}.$$

If for the induced norm of the operator we take the induced \(l_2 \)-norm of the operator (i.e., the \(H_{\infty } \)-norm), then inequality (2.11) becomes

$$ \| A B\|_{\infty } \leqslant \| A\|_{\infty } \| B\|_{\infty }.$$
(2.12)

The latter inequality is closely related to the small gain theorem that was published in 1966 by Zames [248, 249].
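The submultiplicativity inequality (2.12) and the resulting invertibility of \(I - A\Delta \) when \(\|A\|\thinspace \|\Delta \| < 1\) are easy to check numerically for matrices with the spectral (induced \(l_2\)) norm. The random matrices and the scaling factor \(0.9\) below are illustrative choices.

```python
import numpy as np

# Sketch of the small gain reasoning with static matrices: the spectral
# norm is submultiplicative, ||A B|| <= ||A|| ||B||, and if
# ||A|| * ||Delta|| < 1, then I - A Delta is invertible, since
# sigma_min(I - M) >= 1 - sigma_max(M) for any M.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
Delta = rng.standard_normal((4, 4))
# Rescale Delta so that ||A|| * ||Delta|| = 0.9 < 1.
Delta *= 0.9 / (np.linalg.norm(A, 2) * np.linalg.norm(Delta, 2))

M = A @ Delta
assert np.linalg.norm(M, 2) <= np.linalg.norm(A, 2) * np.linalg.norm(Delta, 2) + 1e-12
smin = np.linalg.svd(np.eye(4) - M, compute_uv=False)[-1]
print(smin > 0)  # True: I - A*Delta is invertible under the small gain condition
```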

Let us give a formal statement of the control design problem based on the criterion of the minimum of the \(H_{\infty }\)-norm of the closed-loop system.

Let the open-loop system \(F\) have an \(n \)-dimensional internal state \(x_k \), an \(m_1 \)-dimensional disturbance \(w_k \), an \(m_2 \)-dimensional control \(u_k \), a \(p_1 \)-dimensional controlled signal \(z_k \), and a \(p_2 \)-dimensional measured output \(y_k \), and let it be defined by the equations

$$ \begin {aligned} x_{k+1} &= A x_k + B_1 w_k + B_2 u_k, \\ z_k &= C_1 x_k + D_{{11}} w_k + D_{12} u_k, \\ y_k &= C_2 x_k + D_{21} w_k, \quad -\infty < k < +\infty , \end {aligned}$$
(2.13)

where \(A \), \(C_i \), \(B_j \), and \(D_{ij} \) are constant matrices of appropriate dimensions. The system \(F \) has the block structure

$$ F = \left [\begin {array}{{cc}} F_{{11}} & F_{12} \\[.2em] F_{21} & F_{{22}} \end {array}\right ] .$$
(2.14)

The system \(F \), as well as its subsystems \(F_{ij} \) in (2.14), has the following state space realization:

$$ F \sim \left [ \begin {array}{{l|ll}} A & B_1 & B_2 \\ \hline C_1 & D_{{11}} & D_{12} \\ C_2 & D_{21} & 0 \end {array}\right ] . $$
(2.15)
Fig. 1. Lower fractional-linear transformation \({\cal L}(F,K)\).

If the control signal \(U\) is formed based on the measured output \(Y\) by the controller \(K \), which is an admissible linear time-invariant (not necessarily stable) system, i.e., \(U = K \otimes Y\), then the transfer function from \(W\) to \(Z \) for the resulting closed-loop system is a lower fractional-linear transformation of the pair \((F,K)\) (see Fig. 1),

$$ {\cal L}(F,K) = F_{{11}} + F_{12}\thinspace K\thinspace \left (I_{p_2} - F_{{22}} K\right )^{-1} F_{21} .$$
(2.16)
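
At a single frequency point, the transformation (2.16) is just a matrix formula and can be checked directly. The sketch below uses small constant matrices as a stand-in for the transfer matrices frozen at one frequency; all dimensions and values are illustrative.

```python
import numpy as np

def lower_lft(F11, F12, F21, F22, K):
    """Lower fractional-linear transformation (2.16):
    L(F, K) = F11 + F12 K (I_{p2} - F22 K)^{-1} F21."""
    p2 = F22.shape[0]
    return F11 + F12 @ K @ np.linalg.solve(np.eye(p2) - F22 @ K, F21)

# p1 = m1 = 2, p2 = m2 = 1 in the partition (2.14)
F11 = np.array([[0.5, 0.0], [0.0, 0.3]])   # w -> z (open loop)
F12 = np.array([[1.0], [0.5]])             # u -> z
F21 = np.array([[1.0, 0.2]])               # w -> y
F22 = np.array([[0.2]])                    # u -> y
K = np.array([[0.4]])                      # controller gain

T_zw = lower_lft(F11, F12, F21, F22, K)

# With scalar F22 and K, the formula reduces to
# F11 + F12 K F21 / (1 - F22 K); check it explicitly.
expected = F11 + (F12 @ K @ F21) / (1.0 - 0.2 * 0.4)
assert np.allclose(T_zw, expected)
```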

The problem of designing an optimal \(H_{\infty } \)-control is to find a controller that ensures the minimum of the \( H_{\infty }\)-norm of the system closed by this controller from \(W \) to \(Z \). In other words, the optimal \(H_{\infty } \)-controller must ensure the condition

$$ \|{\cal L}(F,K)\|_{\infty } \rightarrow \inf _K .$$

In the frequency domain, the \(H_{\infty } \)-norm of a linear system can be interpreted as the maximum value of the system amplitude-frequency characteristic. It is well known that the solution of the problem of synthesis of an \(H_{\infty }\)-controller in the frequency domain can be reduced to searching for matrix transfer functions of the closed-loop system with an amplitude-frequency response characteristic more uniform over the entire frequency range [121]. In the Russian literature, such an interpretation of the \(H_{\infty }\)-norm is referred to as the uniform-frequency index [11].

Solving the optimal \(H_{\infty }\)-control problem is reduced to solving the model matching problem (fairly well known in control theory) in the \( H_{\infty }\)-metric (the metric of Hardy spaces). For single-input single-output (SISO) systems, this solution is described in the monograph [118] by reducing the model matching problem to the Nevanlinna–Pick interpolation problem. For systems with multiple inputs and outputs (MIMO), solving the model matching problem, and hence the \(H_{\infty } \)-optimal problem, is reduced to the well-known Nehari extension problem. The design of \(H_{\infty }\)-optimal controllers by reducing this problem to the Nehari problem is described in [121] and also in [192]. In Russia, the solution of the \(H_{\infty } \)-optimization problem by solving the Nehari problem was described in [51].

Despite the attractiveness of \(H_{\infty }\)-controllers, the algorithms for optimal \(H_{\infty }\)-control synthesis were rather difficult for the engineers developing control systems to understand at the turn of the 1980s. In addition, these algorithms had a major drawback in the eyes of the engineers: the optimal controller may well have an order much greater than the order of the plant itself.

An essential point in the construction of the \(H_{\infty } \)-control theory was the transition from the optimal \(H_{\infty } \)-problem to the suboptimal one. The solution of the suboptimal \( H_{\infty }\)-problem in its most complete form in the state space was published for the continuous-time case in the famous “four authors’ paper” [117] and for the discrete-time case in [139]. The solutions of the \(H_{\infty } \)-suboptimal control problem resemble those of the classical \(LQG \) problem. When the solution of the \(H_{\infty } \)-suboptimal control problem is reduced to solving two Riccati equations, its computational complexity turns out to be much lower than that of the optimal problem. The solution of the \(H_{\infty } \)-suboptimal problem using two Riccati equations was called the “2-Riccati approach.”
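
The 2-Riccati machinery itself is beyond a short example, but its \(H_2/LQG \) limiting case (the parameter tending to infinity) already shows the structure: one algebraic Riccati equation for control and a dual one for filtering. The sketch below assumes illustrative matrices and standard SciPy solvers; it is not the \(H_{\infty } \)-suboptimal synthesis itself.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Illustrative plant data (a stabilizable and detectable pair)
A = np.array([[1.1, 0.2], [0.0, 0.9]])
B2 = np.array([[0.0], [1.0]])        # control input
C2 = np.array([[1.0, 0.0]])          # measured output
Q = np.eye(2)                        # state weight (assumption)
R = np.eye(1)                        # control weight (assumption)
W = np.eye(2)                        # process noise covariance (assumption)
V = np.eye(1)                        # measurement noise covariance (assumption)

X = solve_discrete_are(A, B2, Q, R)          # control Riccati equation
Y = solve_discrete_are(A.T, C2.T, W, V)      # dual filtering Riccati equation

# State-feedback and observer gains built from the two solutions
F_gain = np.linalg.solve(R + B2.T @ X @ B2, B2.T @ X @ A)
L_gain = np.linalg.solve(V + C2 @ Y @ C2.T, C2 @ Y @ A.T).T

# Both closed-loop matrices must be Schur stable (spectral radius < 1).
assert np.max(np.abs(np.linalg.eigvals(A - B2 @ F_gain))) < 1.0
assert np.max(np.abs(np.linalg.eigvals(A - L_gain @ C2))) < 1.0
```

In the \(H_{\infty } \)-suboptimal case, the two equations additionally contain the parameter \(\gamma \) and are coupled through a spectral radius condition, which is what distinguishes the 2-Riccati approach from this \(LQG \) limit.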

A solution of the discrete-time \(H_{\infty }\) -suboptimal control problem can always be obtained using the well-known transformation

$$ z=\frac {1+s}{1-s}. $$

This transformation takes functions analytic in a half-plane to functions analytic in the unit disk. Moreover, the Hankel norm and the \(H_{\infty } \)-norm of the transfer function are invariant under this transformation. For this reason, the suboptimal \(H_{\infty } \)-controller can be obtained by the following procedure. Transform the discrete-time plant \(G(z)\) into the corresponding continuous-time plant \(\widetilde {G}(s)=G\left (\frac {1+s}{1-s}\right ) \). Design a controller \(\widetilde {K}(s) \) for the continuous-time problem and transform it into a discrete-time controller using the inverse transformation. This procedure is theoretically well posed. However, the complexity of its implementation is comparable with deriving the desired equations directly. Moreover, the transformation cannot be used for systems having poles at the point \(-1 \); another bilinear transformation is needed for such systems.
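
The properties of this bilinear transformation are easy to check numerically. The sketch below verifies, for the illustrative discrete-time function \(G(z)=1/(z-0.5) \), that the map sends left half-plane points into the unit disk and that the peak gain is the same whether computed on the unit circle or on the imaginary axis.

```python
import numpy as np

# The map z = (1 + s) / (1 - s) sends the open left half-plane into the
# open unit disk and the imaginary axis onto the unit circle.
bilinear = lambda s: (1.0 + s) / (1.0 - s)

s_lhp = np.array([-0.5, -1.0 + 2.0j, -0.1 - 3.0j])
assert np.all(np.abs(bilinear(s_lhp)) < 1.0)   # LHP points land inside the disk

G = lambda z: 1.0 / (z - 0.5)                  # illustrative discrete-time plant

# Peak gain computed directly on the unit circle...
theta = np.linspace(0.0, 2.0 * np.pi, 4000, endpoint=False)
norm_direct = float(np.max(np.abs(G(np.exp(1j * theta)))))

# ...and via the bilinear map evaluated on the imaginary axis.
omega = np.linspace(-200.0, 200.0, 40001)
norm_mapped = float(np.max(np.abs(G(bilinear(1j * omega)))))

assert abs(norm_direct - norm_mapped) < 1e-2   # H-infinity norm is invariant
```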

As was said above, the solution of the suboptimal \(H_{\infty } \)-control problem is reduced to solving two coupled Riccati equations that contain a parameter \(\gamma \) defining a constraint on the upper bound of the performance criterion (the \(H_{\infty } \)-norm of the closed-loop system); i.e.,

$$ \|T_{zw}\|_{\infty } \leqslant \gamma ,$$

where \(T_{zw} \) is the transfer function of the closed-loop system from the disturbing input to the controlled output.

Note that the following assertion, similar to the separation principle in the synthesis of an \( H_{2}\)-optimal controller, holds for \(H_{\infty } \)-suboptimal controllers. An \(H_{\infty } \)-suboptimal output feedback controller is an output estimator for a state-vector control law in the presence of the “worst-case” disturbance. This principle does not imply the possibility of separately solving the estimation problem and the control problem as in the case of the \(H_{2} \)-problem; however, for the worst disturbance equal to zero, the separation principle in the \(H_{\infty }\)-problem becomes the separation principle in the \(H_{2}\)-problem.

The solution of the \(H_{\infty }\) -suboptimal problem is part of various software packages for developing control systems, for example, the well-known Matlab Robust Control Toolbox [254].

2.3. Robust Stability (in \(H_{\infty }\)-Control Theory)

The most important goal in the design of control systems is to ensure the stability of the closed-loop systems. This is the minimum requirement for any controller. In practice, the behavior of the plant may differ from the behavior of its mathematical model (called the nominal plant). These differences can be caused by truncation or insufficient precision of devices for measuring system parameters, technological scatter of characteristics of the plant components, nonlinear or unmodeled dynamics, etc. The difference between the real plant and its nominal model is called modeling error, or system uncertainty. Due to the presence of uncertainties in the system, the controller to be synthesized must stabilize not only the nominal plant but also a set of systems forming a region of uncertainty around the nominal model under the assumption that the real plant belongs to this set.

The necessity to stabilize the system with uncertainty has determined the concept of robust stability: a closed-loop system remains stable in the presence of uncertainties from some set known in advance. The methods of \(H_{\infty }\)-control theory have contributed to obtaining significant results in the field of robust stabilization of plants with uncertain parameters.

There are many ways to describe uncertainties in control systems. For most descriptions of these uncertainties, the authors recommend the paper [191], which lists and defines the main ways to describe uncertainties. A fairly common way of describing the measure of uncertainty is the \(L_{\infty } \)-norm.

In \(H_{\infty }\)-control theory, it is conventional to model uncertainty by a transfer function separated from the transfer function of the nominal plant. This approach was first used for additive and multiplicative uncertainties in [106, 116] and for uncertainties in the form of coprime factors in [222, 223].

Using the upper fractional-linear transformation of the pair \((F,\thinspace \Delta ) \) (defined in (2.17) below), we define the form of the matrix \(F \) for the main types of unstructured uncertainties following the monograph [132]:

  1.

    An additive uncertainty is associated with the matrix \(F=\left [ \begin {array}{{cc}} 0 & I \\ I & G_O \\ \end {array} \right ]\), where \(G_O \) is the transfer function of the nominal plant.

  2.

    An inverse additive uncertainty is associated with the matrix \(F=\left [ \begin {array}{{cc}} -G_O & G_O\\ -G_O & G_O \\ \end {array} \right ] \).

  3.

    An input multiplicative uncertainty is associated with the matrix \(F=\left [ \begin {array}{{cc}} 0 & I\\ G_O & G_O \\ \end {array} \right ]\).

  4.

    An output multiplicative uncertainty is associated with the matrix \(F=\left [ \begin {array}{{cc}} 0 & G_O\\ I & G_O \\ \end {array} \right ]\).

  5.

    An inverse input multiplicative uncertainty is associated with the matrix \(F=\left [ \begin {array}{{cc}} -I & I\\ -G_O & G_O \\ \end {array} \right ] \).

  6.

    An inverse output multiplicative uncertainty is associated with the matrix \(F=\left [ \begin {array}{{cc}} -I & G_O\\ -I & G_O \\ \end {array} \right ] \).

  7.

    A left uncertainty in the form of coprime factors is associated with the matrix

    $$ F=\left [ \begin {array}{{cc}} \left [ \begin {array}{c} -{\widetilde {M}}^{-1} \\ 0 \\ \end {array} \right ] & \left [ \begin {array}{c} -G_O \\ I \\ \end {array} \right ]\\[1em] {\widetilde {M}}^{-1} & G_O \\ \end {array} \right ], $$

    where \(G_O={\widetilde {M}}^{-1}\widetilde {N} \) is the left coprime factorization of the nominal plant, \(F =(\widetilde {M}+\widetilde {\Delta }_M)^{-1}(\widetilde {N}+\widetilde {\Delta }_N)\) is the transfer function of the perturbed plant, and \(\Delta =\left [\begin {array}{{cc}}\widetilde {\Delta }_M &\widetilde {\Delta }_N\end {array}\right ]\).

  8.

    A right uncertainty in the form of coprime factors is associated with the matrix

    $$ F=\left [ \begin {array}{{cc}} \left [ \begin {array}{{cc}} -{\widetilde {M}}^{-1} & 0 \end {array} \right ] & {\widetilde {M}}^{-1}\\[.6em] \left [ \begin {array}{{cc}} -G_O & I \end {array} \right ] & G_O \\ \end {array} \right ], $$

    where \(G_O=\widetilde {N}{\widetilde {M}}^{-1} \), \(F=(\widetilde {N}+\widetilde {\Delta }_N)(\widetilde {M} +\widetilde {\Delta }_M)^{-1} \), and \(\Delta =\left [\begin {array}{c}\widetilde {\Delta }_M \\ \widetilde {\Delta }_N \\ \end {array}\right ] \).

Fig. 2. Upper fractional-linear transformation \({\cal U}(F,\Delta )\).

Figure 2 schematically represents the so-called upper fractional-linear transformation, which is given by the formula

$$ {\cal U}(F,\Delta )=F_{{22}}+F_{21}\Delta (I_n-F_{{11}}\Delta )^{-1}F_{12}.$$
(2.17)

Provided that \((I_n-F_{{11}}\Delta ) \) is invertible, a system with any of the above uncertainties (additive, multiplicative, or in the form of coprime factors) can be represented as an upper fractional-linear transformation of the so-called standard plant \(F\) [174], written in the block form (2.14), and of the uncertainty \(\Delta \).
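
Individual rows of the classification above can be checked at a frozen frequency, where the blocks of \(F \) become constant matrices. The following sketch verifies the additive and input multiplicative cases for illustrative \(G_O \) and \(\Delta \).

```python
import numpy as np

def upper_lft(F11, F12, F21, F22, Delta):
    """Upper fractional-linear transformation (2.17):
    U(F, Delta) = F22 + F21 Delta (I - F11 Delta)^{-1} F12."""
    n = F11.shape[1]
    return F22 + F21 @ Delta @ np.linalg.solve(np.eye(n) - F11 @ Delta, F12)

rng = np.random.default_rng(0)
G0 = rng.standard_normal((2, 2))           # nominal plant frozen at one frequency
Delta = 0.1 * rng.standard_normal((2, 2))  # small unstructured uncertainty
I2, Z2 = np.eye(2), np.zeros((2, 2))

# Additive uncertainty: F = [[0, I], [I, G0]] yields G0 + Delta.
assert np.allclose(upper_lft(Z2, I2, I2, G0, Delta), G0 + Delta)

# Input multiplicative uncertainty: F = [[0, I], [G0, G0]]
# yields G0 (I + Delta).
assert np.allclose(upper_lft(Z2, I2, G0, G0, Delta), G0 @ (I2 + Delta))
```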

In all the uncertainty types described above, it is assumed that \(\Delta \) does not possess any particular structure. Now let us consider a structured uncertainty that includes unmodeled dynamics (an unstructured part) and a parametric uncertainty. In this case, the system with uncertainty can be represented by the upper fractional-linear transformation (2.17) with

$$ \Delta = \mathrm {diag} \left (\delta _1 I_{r_1},\thinspace \ldots ,\thinspace \delta _s I_{r_s},\thinspace \Delta _1,\thinspace \ldots ,\thinspace \Delta _f\right ), $$

where \(\delta _i\in \mathbb {C},\;\Delta _j\in \mathbb {C}^{h_j\times h_j},\) \(\Sigma _{i=1}^{s}r_i+\Sigma _{j=1}^{f}h_j=n\), and \(n \) is the order of \(\Delta \).
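
As a small illustration of this block-diagonal structure, the sketch below assembles \(\Delta \) from two repeated scalar blocks (\(r_1=2 \), \(r_2=1 \)) and one full block (\(h_1=2 \)); all numerical values are made up.

```python
import numpy as np
from scipy.linalg import block_diag

# Repeated scalar uncertainties (e.g., parametric) and one full block
# (e.g., unmodeled dynamics); the values are purely illustrative.
delta_1, delta_2 = 0.3 + 0.1j, -0.2j
Delta_full = np.array([[0.10, 0.05], [0.00, -0.10]])

Delta = block_diag(delta_1 * np.eye(2), delta_2 * np.eye(1), Delta_full)

# The order n of Delta equals the sum of the r_i plus the sum of the h_j.
assert Delta.shape == (2 + 1 + 2, 2 + 1 + 2)
```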

Necessary and sufficient conditions for the robust stabilizability of the nominal plant in the presence of uncertainties are given in [174]. These conditions amount to a test on the \(H_{\infty } \)-norm of the system transfer function. The mathematical basis of the test for different types of uncertainties is the above-mentioned small gain theorem [248, 249] and the circular (submultiplicative) property of the induced norms (2.11) [25].

Robust controllers can be found using the procedure for solving the \(H_{\infty } \)-optimization problem posed in [250].

The present paper does not aim at a detailed presentation of robust analysis and design problems, because there are a lot of papers devoted to them [120, 130, 205, 252]. Here we have only stated the basic concepts of robust analysis so that later, when these questions are presented in the framework of the anisotropy-based control theory, it will be possible to refer to the results obtained, in particular, in \(H_{\infty }\)-control theory. More details about robust analysis and design problems can be found in the recently published survey [191].

The theory of robust (\(H_{\infty }\)-optimal and \(H_{\infty } \)-suboptimal) control for systems given in the state space was created in the late 1980s. Within the framework of this theory, a set of uncertainties that must be considered during control system design was clearly defined.

The necessity to describe classes of uncertainties has long been understood by specialists in control theory. There are many papers in which uncertainties in control theory are defined in some way, but here we are interested in uncertainties in the description of the system occurring in the \( H_{\infty }\)-optimization problem [100, 157, 228, 236, 240, 243].

Within the framework of these theories, it was possible to guarantee the robustness of the closed-loop system using the small gain theorem.

There are quite a few published monographs dealing with the \(H_{\infty } \)-optimal and \(H_{\infty } \)-suboptimal control theories both abroad and in Russia. It is worth mentioning some books published abroad [118, 121, 130, 205, 252]. The surveys [51, 76, 77], published in Russian, have now become a bibliographic rarity. The \( H_{\infty }\)-optimal and \(H_{\infty } \)-suboptimal controls are mentioned in the monographs [8, 35, 52, 78].

2.4. \( H_{2}\)-Suboptimal and Robust Controllers

It would be natural for the statements of the \(H_{2} \)-suboptimal control design problems to arise in the early 1960s, following the statements and solutions of the \(H_{2} \)-optimal problems [41, 148]. However, historically, the extensive study of suboptimal and robust \(H_{2}\)-controllers started much later, in the 1990s. In the authors’ opinion, this was due to two factors. The first is the advent of numerical methods for solving linear matrix inequalities [101, 184] and convex optimization problems [102], and the second is the development of methods and techniques for solving robust and suboptimal \(H_{\infty }\)-control problems [101]. Therefore, we have placed the section devoted to \(H_{2}\)-suboptimal robust controllers after the section devoted to \(H_{\infty }\)-controllers.

As pointed out in the introduction to the paper [170], “in a practical problem it may turn out that the optimal \(H_2 \)-controller may not exist for a given specific plant. That is, the given plant cannot satisfy the necessary and sufficient conditions for the existence of an optimal \(H_2 \)-control” (for more detailed information, see, e.g., [195]). Then the developer is forced to find a suboptimal controller. In the absence of a formal definition of a suboptimal controller, any controller that provides the internal stability of the closed-loop system can be interpreted as suboptimal. However, it is natural to define suboptimality via the \(H_2 \)-norm (or any specified norm) of the selected transfer function satisfying the constraint

$$ \|T_{zw}\|_{2}\leqslant \gamma , $$

where \(T_{zw}\) is the matrix of transfer functions of the closed-loop system from \(W \) to \(Z \) and \(\gamma \) is an admissible value of the disturbance gain.

The book [195] provides a brief description of the suboptimal \(H_2\)-control problem. This book solves the control design problem in which the \(H_2 \)-norm of the closed-loop system is arbitrarily close to the optimal value. The theoretical solution of this problem is provided using perturbation theory.

We list the results of only some papers dealing with \(H_2 \)-control synthesis. In [170], a suboptimal state feedback control is constructed with the use of three different estimators (prediction, current state estimation, and a reduced-order estimator) for discrete time-invariant systems. A dynamic compensator for discrete-time systems providing stability and the desired system performance was developed in [154]. The problems of robust \(H_2 \)-estimation for systems with norm-bounded and polytopic uncertainties were considered in [235]. Output control for systems with uncertainties was proposed in [197].

The survey offered to the reader can in no way be regarded as an exposition of the \(LQG \), \(H_2 \), \(H_{\infty } \), or any other theories related to these problems, because it is solely intended to present the papers that have appeared in this direction. Nevertheless, we have had to state the above-mentioned problems.

2.5. Model Reduction in \(H_{2}\)- and \(H_{\infty } \)-Control Theories

The solution of the problems described above is reduced to finding an optimal (suboptimal) controller of full order equal to the order of the plant model. In technical applications, it is often necessary to design a controller of a reduced (given) order lower than the order of the model. We point out that the problem of synthesizing such a controller is rather difficult, because the conditions arising in its solution are often nonconvex in the controller parameters. The methods for synthesizing reduced-order controllers are divided into direct and indirect ones. In the direct methods, the parameters of the reduced-order controller are calculated immediately using an optimization or some other procedure. In the indirect approaches, either a full-order controller is first constructed and subsequently reduced, or the plant model is first reduced, a full-order controller is constructed for the reduced model, and this controller is then used to control the original model. Let us dwell on the indirect methods in more detail.

Reducing the plant model is one of the classical control theory problems and has been dealt with in many publications, for example, [86, 125, 128, 134, 136, 137, 146, 169, 175, 177, 182, 234, 242, 251].

The existing reduction methods can be arranged into three main trends. The first trend includes methods based on cutting away or discarding some of the equations describing the system [146, 175, 177, 182]. The papers [146, 182] reduce the order by cutting away \(LQG \)- and \(H_\infty \)-controllers, respectively. The reduction is applied to closed-loop optimal systems, and the solutions of the algebraic filtering and control Riccati equations corresponding to the \(LQG\) and \(H_\infty \)-problems are reduced to diagonal form. Although the methods in [146, 175, 177, 182] have been developed for continuous-time systems, they can readily be modified for the case of discrete-time systems [252]. It should be noted that the cut-away technique leads to some loss in the performance of the closed-loop system and imposes constraints on the reduced order of the controller. These constraints are related to the possible instability of the full-order system closed by the reduced-order controller.

The second trend is formed by methods for the optimal approximation to a linear system by a reduced-order model using various performance criteria [86, 134, 136, 137, 169, 182, 234, 242]. For the criteria one can take, for example, the quadratic approximation error criterion [134, 137, 234], the \(H_2 \)-norm of the approximation error model [136, 242], the \(H_\infty \)-norm [86], and other criteria [169].

Finally, the third trend incorporates combined reduction and optimal approximation methods. These include, for example, the optimal approximation to linear time-invariant systems in the Hankel norm proposed in [125] and developed for approximation in certain frequency ranges in [134]. The paper [128] considers methods for the reduction of \(H_\infty \)-controllers that allow maintaining a constant value of the \(H_\infty \)-norm of the closed-loop system with a reduced-order controller and ensuring the stability of this system. These methods are based on discarding coprime factors of the transfer function of the controller with the subsequent approximation of the reduced factors by fractional-rational functions.

3. GENERAL PARADIGM OF \(H_{2}\)- AND \(H_{\infty } \)-CONTROL THEORIES

The best-known methods for external disturbance rejection in linear time-invariant systems are the \(H_{2} \)- and \(H_{\infty } \)-approaches, in which the performance criterion of the closed-loop system is the norm of the disturbance–to–controlled-output transfer function. It is worth noting that the \(H_{\infty }\)-norm is an induced norm, while the \(H_{2}\)-norm is not. In the \(H_{2} \)-problem, the disturbance is always of a specific form, namely, Gaussian white noise with zero mean and identity covariance matrix, whereas in the \(H_{\infty } \)-problem, it is the worst-case disturbance. In the case of \(H_{\infty } \)-control, the maximum (over the entire frequency range) norm of the transfer matrix (as the gain of the external disturbance) is minimized.

Fig. 3. System \(F \) closed by controller \(K \).

In view of the fact that the \(H_{\infty }\)-theory works with a wide set of disturbances, various control problems can be stated and solved within the framework of this theory. Let us consider the statements of these problems for the closed-loop system depicted in Fig. 3. Here \(F \) is the plant, \(K \) is the control law to be designed, \(r \) is the reference input signal, \(y \) is the system output, \(u \) is the control, \(e \) is the error between the reference signal and the system output, \(d \) is the disturbance, and \(n \) is the measurement noise. The output, control, and error signals are generated as follows:

$$ \begin {aligned} y &= (I+FK)^{-1}FKr+(I+FK)^{-1}d-(I+FK)^{-1}FKn,\\ u &= K(I+FK)^{-1}r-K(I+FK)^{-1}d-K(I+FK)^{-1}n, \\ e &= (I+FK)^{-1}r-(I+FK)^{-1}d-(I+FK)^{-1}n. \end {aligned} $$

We will assume that the signals \(r\), \(d \), and \(n \) are of limited energy and have been normalized, i.e., lie in the unit ball of the space \(\mathbb {L}_2\). However, the nature of these signals is not known precisely. Under the above assumptions, one can synthesize stabilizing controllers \(K\) for solving the following problems with the minimization of the \(H_{\infty } \)-norms of the corresponding systems:

  1. The tracking problem, \(\|(I+FK)^{-1}FK\|_{\infty } \).

  2. The rejection of external disturbances, \(\|(I+FK)^{-1}\|_{\infty } \).

  3. The suppression of noise, \(\|-(I+FK)^{-1}FK\|_{\infty } \).

  4. The reduction of the control energy, \(\|K(I+FK)^{-1}\|_{\infty } \).
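
The four objectives above can be evaluated on a frequency grid for a toy scalar plant and static controller (both our own illustrative choices). The identity \(S+T=I \) for the sensitivity \(S=(I+FK)^{-1} \) and complementary sensitivity \(T=(I+FK)^{-1}FK \) shows why the objectives conflict.

```python
import numpy as np

F = lambda z: 0.5 / (z - 0.8)          # stable discrete-time plant (assumption)
K_gain = 2.0                           # static stabilizing controller (assumption)

theta = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
z = np.exp(1j * theta)
L = F(z) * K_gain                      # loop gain F K

T = L / (1.0 + L)                      # 1. tracking: (I + FK)^{-1} FK
S = 1.0 / (1.0 + L)                    # 2. disturbance rejection: (I + FK)^{-1}
N = -T                                 # 3. noise suppression: -(I + FK)^{-1} FK
KS = K_gain * S                        # 4. control energy: K (I + FK)^{-1}

hinf = lambda X: float(np.max(np.abs(X)))

# S + T = 1 at every frequency, so the tracking and disturbance-rejection
# objectives cannot both be made arbitrarily small.
assert np.allclose(S + T, 1.0)
assert hinf(N) == hinf(T)
```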

Further, we use the notation \(LQG/H_2\) or simply \(H_2 \) to denote a problem that is more general than the \(LQG \) problem, namely, the \(H_2 \)-problem. One can read about the inclusion of the \(LQG \) problem in the \(H_2 \)-problem in various papers, e.g., [99].

The general paradigm of the \(LQG/H_{2}\) and \(H_{\infty } \)-control problems is depicted in Fig. 1.

Here \(F\) is the plant, \(K \) is the controller, \(W \) and \(Z \) are, respectively, the external input and the controlled output of the system, \(Y\) and \(U \) are the measured output and the control, and \(T_{zw} \) is the transfer function (the matrix of transfer functions) of the closed-loop system from \(W\) to \(Z \). In both problems, it is required to design a control that minimizes the performance criterion corresponding to the problem.

In what follows, we consider the discrete-time model of control systems.

It is convenient to present the general view of the \(H_{2} \)- and \(H_{\infty } \)-control theory problems in the light of the paradigm presented in Fig. 1 as various interpretations of the problem of rejecting external disturbances.

The standard \(H_{2}\)-optimization problem is to find a controller \(K\) (see Fig. 1) that

  1. Stabilizes the closed-loop system.

  2. Minimizes the \(H_{2}\)-norm of the transfer function (the matrix of transfer functions) \(T_{zw} \) of the closed-loop system from \(W \) to \(Z \); i.e.,

    $$ \|T_{zw}\|_2\rightarrow \min . $$
    (3.1)

The standard \(H_{\infty }\) -optimization problem is to find a controller \(K\) (see Fig. 1) that

  1. Stabilizes the closed-loop system.

  2. Minimizes the \(H_{\infty }\)-norm of the transfer function (the matrix of transfer functions) \(T_{zw} \) of the closed-loop system from \(W \) to \(Z \); i.e.,

    $$ \|T_{zw}\|_{\infty }=\sup _{|z| < 1}\thinspace \overline {\sigma }(T_{zw}(z))\rightarrow \min .$$
    (3.2)
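
For a concrete stable system, both criteria can be computed directly: the \(H_{2} \)-norm from a Lyapunov (Gramian) equation and the \(H_{\infty } \)-norm as the peak gain over a frequency grid. The matrices below are illustrative, not taken from the text.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Illustrative stable discrete-time system T(z) = C (zI - A)^{-1} B + D
A = np.array([[0.5, 0.1], [0.0, 0.3]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

# H2 norm via the observability Gramian Q solving A' Q A - Q + C'C = 0:
# ||T||_2^2 = trace(B' Q B + D' D).
Q = solve_discrete_lyapunov(A.T, C.T @ C)
h2 = float(np.sqrt(np.trace(B.T @ Q @ B + D.T @ D)))

# Frequency-domain counterparts of (3.1) and (3.2) on a unit-circle grid.
theta = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
gains = np.array([
    np.abs(C @ np.linalg.solve(np.exp(1j * th) * np.eye(2) - A, B) + D)[0, 0]
    for th in theta
])
h2_freq = float(np.sqrt(np.mean(gains**2)))   # H2: root-mean-square gain
hinf = float(np.max(gains))                   # H-infinity: peak gain

assert abs(h2 - h2_freq) < 1e-8               # the two H2 computations agree
assert h2 <= hinf                             # for SISO, ||T||_2 <= ||T||_inf
```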

The stated problem (just as any minimax problem) can be considered as an antagonistic two-player game in which the first player is the developer of the control system, who selects the controller \(K\), and the second is nature, which maximizes the effect of the disturbance on the gain of the system [87].

One usually considers a suboptimal \(H_{\infty } \)-problem, which differs from the above-stated optimal one by the requirement that

$$ \|T_{zw}\|_{\infty }\leqslant \gamma , $$
(3.3)

where \(\gamma \geqslant \gamma _{\mathrm {opt}} \). If the quantity \(\gamma \) is given in advance and it turns out that \(\gamma < \gamma _{\mathrm {opt}}\), then the suboptimal synthesis problem has no solution.

The frequency interpretation of the \(H_{2} \)- and \(H_{\infty } \)-optimization problems for SISO systems is transparent. The \(H_{\infty }\) -controllers are synthesized so as to minimize the maximum value of the frequency response of the closed-loop system, while the \(H_{2}\)-control minimizes the average amplitude over all frequencies.

The theories of optimal and suboptimal control with the criterion of the minimum of the \(H_{\infty } \)-norm of the closed-loop system ensure, among other goals, satisfaction of constraints on external disturbance rejection [188]. It is remarkable that the \(H_{\infty } \)-suboptimal control theory is based on solving Riccati equations containing a certain parameter and is very similar to the theory of controller design for linear systems with a quadratic performance criterion [117]. As the value of this parameter tends to infinity, the equations for the synthesis of the \(H_{\infty } \)-suboptimal controller tend to the Riccati equations for the \(LQG \) problem. However, being minimax, i.e., designed for the worst case of input disturbances, \(H_{\infty }\)-optimal controllers have natural drawbacks: to minimize the performance criterion, the control value sometimes becomes very large, and such systems are difficult to implement. In addition, systems with the \( H_{\infty }\) performance criterion are very conservative.

Suboptimal \(H_{\infty }\)-controllers that keep the induced norm of the operator of the closed-loop system below a fixed boundary \(\gamma \) are not unique. Indeed, all controllers reaching a given boundary on the norm of the closed-loop transfer function can be expressed in terms of the fractional-linear transformation [193] of the controller \(K_c \) (known as the “central” controller [117]) and a free parameter \(Q\in H_{\infty } \), \(\|Q\|_{\infty }<1 \). Although each choice of the parameter \(Q \) guarantees the constraint on the norm, it is of interest to find the cases in which there exists a parameter \(Q \) for which the resulting controller also minimizes an auxiliary cost functional for the closed-loop system. The natural choice of the auxiliary functional is the \(H_2 \)-norm of the transfer function of the closed-loop system. One of the drawbacks of \(H_{\infty }\)-control is that the performance of the closed-loop system, usually associated with the \(H_2 \)-norm of its transfer function, is sacrificed to the robustness guaranteed by \(H_{\infty } \)-controllers. Neither of the two concepts taken separately (\( H_2\)- or \(H_{\infty } \)-control) is satisfactory from the engineering point of view; developers prefer trade-offs.

These trade-offs between the merits and demerits of the \(LQG \) and \(H_{\infty } \) theories can be divided into two strands. The first is the minimization of the \(H_2\)-norm of the closed-loop system under constraints on the \(H_{\infty }\)-norm, and the second is the minimization of the \(H_{\infty }\)-norm of the closed-loop system with simultaneous minimization of an upper bound on the cost functional used in the \(H_2 \)-optimal control problem.

4. TRADE-OFFS BETWEEN \(LQG\) AND \(H_{\infty } \)

One of the first papers related to the first of the above trade-offs was the paper by Bernstein and Haddad [95], which posed the problem of designing an \(LQG \) controller that simultaneously satisfies constraints on the \(H_{\infty }\)-norm of the transfer function of the closed-loop system. The method for synthesizing such controllers leads to solving three coupled modified Riccati equations. The coupling of these equations reflects the breakdown of the separation principle for the \(LQG\) problem with the \(H_{\infty } \)-constraint. It is important to note that two of these three Riccati equations, through whose solutions the optimal controller matrices are constructed, are already known, because they coincide with the equations for solving the \(LEQG \) problem investigated in [93]. Thus, an explicit connection is traced between the \(LQG \) problem with a constraint on the \(H_{\infty } \)-norm of the transfer function of the closed-loop system and the \( LEQG\) problem.

Let us describe the range of issues related to the second strand of the trade-offs described above. As is well known, the \(H_{\infty }\)-suboptimal control is not unique and can be parametrized in some way [126]. However, if, on the set of suboptimal controllers, we pose the problem of minimizing the so-called \(H_{\infty }\)-entropy functional

$$ J(\gamma ,F) = - \frac {\gamma ^2}{2 \pi }\thinspace \int \limits _{-\infty }^{\infty } \ln \Big | \det \left ( I_m - \gamma ^{-2} \left ( F(j\omega ) \right )^* F(j\omega ) \right ) \Big |\thinspace d \omega , $$
(4.1)

where \(\gamma \) is the quantity bounding the \( H_{\infty }\)-norm of the transfer function of the closed-loop system \(F(s)\) (this functional is similar to the entropy functional introduced by Arov and Krein in extension problems [4, 5]), then the resulting controller (which is a solution of the suboptimal problem and minimizes the \(H_{\infty } \)-entropy functional) proves to be unique [178]. This controller is the so-called central controller. Moreover, it was shown in [126] that the problem of control design that assumes the minimization of the \(H_{\infty } \)-entropy is, in a sense, equivalent to the problem of synthesizing a controller based on the criterion of minimizing the risk sensitivity functional (2.10). The same result was obtained by a different method in [231].
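
The entropy functional (4.1) can be evaluated numerically for a toy scalar plant. The sketch below uses the illustrative choice \(F(s)=1/(s+1) \) and checks that the entropy dominates the squared \(H_2 \)-norm and approaches it as \(\gamma \) grows.

```python
import numpy as np

# Frequency grid for a crude Riemann-sum evaluation of (4.1);
# the plant F(s) = 1/(s + 1) is an illustrative choice.
omega = np.linspace(-500.0, 500.0, 200_001)
d_omega = omega[1] - omega[0]
F2 = 1.0 / (1.0 + omega**2)            # |F(j*omega)|^2

def entropy(gamma):
    """J(gamma, F) = -(gamma^2 / (2*pi)) * int ln(1 - |F|^2/gamma^2) d omega,
    the scalar case of (4.1)."""
    integrand = -gamma**2 * np.log(1.0 - F2 / gamma**2)
    return float(np.sum(integrand) * d_omega / (2.0 * np.pi))

# ||F||_2^2 = (1/(2*pi)) * int |F(j*omega)|^2 d omega (equals 1/2 here)
h2_sq = float(np.sum(F2) * d_omega / (2.0 * np.pi))

# The entropy upper-bounds the squared H2-norm (since -ln(1-x) >= x)
# and tends to it as gamma -> infinity.
assert entropy(2.0) >= h2_sq
assert abs(entropy(100.0) - h2_sq) < 1e-3
```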

It is well known [180] that the \(H_{\infty } \)-entropy is a measure of the mismatch between \( H_{2}\)- and \(H_{\infty } \)-optimality. Indeed, the \(H_{\infty } \)-entropy is an upper bound on the \(H_{2} \)-norm of the closed-loop system, so minimizing the \(H_{\infty } \)-entropy integral minimizes an upper bound on the cost functional in the \(H_{2}\thinspace (LQG) \)-problem. Note that one can choose a continuous-time \(H_{\infty }\)-suboptimal controller that leads to an unbounded \(H_{2} \)-norm of the closed-loop system. The problem of minimizing the \(H_{\infty }\)-entropy functional attracted a lot of attention from specialists in \( H_{2} \)- and \(H_{\infty } \)-control, because there was hope that, by minimizing the \( H_{\infty }\)-entropy functional, the developer at the same time minimizes the \(H_{2}\)-performance criterion. As far as the authors are aware, this conjecture has so far been neither proved nor refuted.
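
The upper-bound property can be checked numerically on a toy first-order system (our own illustration, not taken from the cited papers): for \(F(s)=1/(s+1)\) we have \(\|F\|_2^2=1/2\) and \(\|F\|_\infty =1\), and the entropy functional (4.1), which for a scalar system reduces to \(-\frac{\gamma^2}{2\pi}\int \ln (1-\gamma^{-2}|F(j\omega )|^2)\thinspace d\omega \), exceeds \(\|F\|_2^2\) for any \(\gamma >\|F\|_\infty \).

```python
import numpy as np

# Toy example (our assumption): F(s) = 1/(s+1),
# so ||F||_2^2 = 1/2 and ||F||_inf = 1; take gamma = 2 > ||F||_inf.
omega = np.linspace(-2000.0, 2000.0, 2_000_001)
dw = omega[1] - omega[0]
F2 = 1.0 / (1.0 + omega**2)          # |F(j*omega)|^2
gamma = 2.0

# H-infinity entropy (4.1); for a scalar system |det(...)| = 1 - |F|^2/gamma^2
J = -(gamma**2 / (2.0 * np.pi)) * np.sum(np.log(1.0 - F2 / gamma**2)) * dw

# squared H2 norm of the closed-loop system
H2sq = np.sum(F2) * dw / (2.0 * np.pi)

print(J, H2sq)   # ~0.536 and ~0.5: the entropy dominates ||F||_2^2
```

A residue computation for this example gives \(J=\gamma ^2\big (1-\sqrt {1-\gamma ^{-2}}\big )\approx 0.536\) for \(\gamma =2\), which approaches \(\|F\|_2^2=0.5\) as \(\gamma \to \infty \), in line with the limiting behavior described above.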

After the appearance of the papers [127, 178, 179], a number of studies were published concerning the design of controllers that maintain a constraint on the \(H_{\infty }\)-norm of the closed-loop system and optimize the \(H_{\infty }\)-entropy functional. For example, in [138], for discrete time, state-space formulas were obtained for a controller ensuring that the \(H_{\infty }\)-norm of the closed-loop system is bounded by the parameter \(\gamma \) and minimizing the \(H_{\infty } \)-entropy functional. The solution is obtained by reformulating the problem in continuous time via a bilinear transformation that conformally maps the unit disk onto the left half-plane. In [181], a solution of the problem of minimizing the entropy functional, alternative to that in [126], is given under the requirement that the \(H_{\infty }\)-norm of the closed-loop system be bounded and the system be stable. This solution is obtained by reducing the original problem, using the Youla–Kučera parametrization (see, e.g., [130]), to the model tracking problem [121] and then further to the so-called distance problem. In [140], the problem of finding a controller that minimizes the entropy integral under a constraint on the \(H_\infty \)-norm of the closed-loop system is reduced to solving two auxiliary problems (the complete information and output estimation problems) and then applying the separation principle. The paper [239] considers the problem of synthesizing a static output controller that minimizes the \(H_{\infty }\)-entropy functional and satisfies a given constraint on the \(H_{\infty } \)-norm of the closed-loop system. The solution reduces to coupled Riccati and Lyapunov equations. The approach is similar to that in [95].

For continuous singularly perturbed systems, a robust static output control minimizing the \( H_\infty \)-entropy functional of the closed-loop system was considered in [122]. It was required that the controller provides constraints on the \(H_\infty \)-norm of the closed-loop system and minimizes the \(H_\infty \)-entropy of the closed-loop system for sufficiently small values of the singular perturbation \(\varepsilon \). The optimal controller gain is synthesized based on coupled generalized Riccati and Lyapunov equations with symmetric \(2\times 2\) blocks. As \(\varepsilon \rightarrow 0\), the optimal controller tends to one of those minimizing the \(H_\infty \)-entropy of the closed-loop continuous-time system.

The concept of entropy used in \(H_{\infty }\) -optimization for discrete-time systems was extended to the time-varying case in [141]. This generalization is not trivial, because the \(H_{\infty } \)-entropy for a time-invariant system is defined in terms of the transfer function of the closed-loop system, which does not exist for time-varying systems. The entropy for time-varying discrete-time systems was defined in terms of operator theory with the use of fundamental factorization theorems. In [186], the problem of designing a control that minimizes the entropy functional was solved for time-varying discrete-time systems, and in [187], a relationship between the problem of minimizing the entropy for time-varying systems and the problem of minimizing the risk sensitivity function was established for such systems.

The already mentioned paper by Bernstein and Haddad [95] became the first attempt to find a trade-off between robust stability (by the well-known small gain theorem, the \( H_\infty \)-norm is responsible for the robust stability of a closed-loop system) and the rejection of a random white-noise disturbance according to a quadratic performance criterion (the \(H_2 \)-norm). It gave rise to a whole direction in robust control theory referred to as mixed \(H_2/H_\infty \)-control. As noted in the book [107], such controllers represent the desired trade-off between the \(H_2 \)- and \(H_\infty \)-control theories.

Fig. 4. Mixed \(H_2/H_\infty \)-control problem.

Let us briefly expose the paradigm of mixed \(H_2/H_\infty \)-control following [155] (see Fig. 4).

Here, just as in the previous figure, \(F \) is the plant, \(K \) is the controller, and \(T_{z_i w_i} \), \(i=0,1 \), is the transfer matrix of the closed-loop system from \(W_i \) to \(Z_i \). The mixed \(H_2/H_\infty \)-control problem is to find an internally stabilizing controller \(K \) minimizing \(\|T_{z_0 w_0}\|_{2} \) and ensuring that \(\|T_{z_1 w_1}\|_{\infty }<\gamma \).

The control problem considered in [95] is obtained from the mixed \(H_2/H_\infty \)-control problem with \(W_0=W_1=W \). Instead of minimizing \(\|T_{z_0 w}\|_{2} \), the paper [95] solves the problem of synthesizing an \(LQG \) controller with a constraint on the \(H_\infty \)-norm of the closed-loop system, which is called the mixed \( H_2/H_\infty \) performance criterion.

It was shown in [179] for the case of \( W_0=W_1=W\) and \(Z_0=Z_1=Z \) that the problem in [95] is equivalent to the entropy minimization problem [127].

The mixed \(H_2/H_\infty \)-control theory was further developed in [119, 253]. The introductions to these papers say that one motivation for writing them was the desire to obtain a general setting for the \(H_2 \)- and \(H_\infty \)-optimization problems, since the solutions of these problems are carried out according to a similar scheme (see [117]). Running ahead, we note that the general statement of the \( H_2\thinspace (LQG)\)- and \(H_\infty \)-optimization problems has been obtained within the framework of the anisotropy-based control theory described below. The problems posed and solved in [253] and [119] are in a sense dual to the results in [95]. In [176], the stochastic mixed \(H_2/H_\infty \)-problem was solved for the discrete-time case.

The recently published paper [30] proposes one approach to the creation of a general \(H_2/H_\infty \)-control theory for systems with deterministic inputs. The paper introduces a characteristic, called the \(H_{\infty }/\gamma _0 \)-norm, of the operator from the spaces of inputs to the space of outputs. This new norm essentially depends on the matrix \(R \) included in its definition. In the extreme cases, when one of the inputs is missing, this norm reduces to one of the constituent norms. The approach resembles the previously proposed one based on the weighted combination \(\lambda \| \cdot \|_2+(1-\lambda )\|\cdot \|_{\infty }\), \(\lambda \in [0,1] \), of the two norms. In the indicated paper, an optimal state-feedback control law minimizing the \(H_{\infty }/\gamma _0 \)-norm was also synthesized.

We also note the papers [85, 185], which demonstrate a relationship between the problems of minimizing the \(H_{\infty }\)-entropy and the mixed \( H_2/H_\infty \)-control for time-invariant and time-varying control systems, respectively. In the present survey, the authors would not like to delve into the rigorous definition of the mixed \(H_{2}/H_{\infty }\) -control problem in the time-varying case; the reader can find the necessary information in the papers cited above.

5. ANISOTROPY-BASED CONTROL THEORY—EARLY DAYS

In this section, we briefly describe some notions of information theory necessary for the presentation of the foundations of anisotropy-based control theory; introduce the basic concepts of anisotropy-based control theory such as anisotropy of a random vector, mean anisotropy of the signal, and anisotropic norm of the system; and state the problem of optimal anisotropy-based control design and describe its solution.

5.1. Some Necessary Information on Information Theory

Let \(X \) be a discrete random variable with an alphabet \(\mathcal {X} \), and let a probability mass function \(p(x)=\mathrm {Pr}\{X=x\}\), \(x\in \mathcal {X} \), be given.

The entropy \(H(X) \) of the random variable \(X \) is defined as

$$ H(X)=-\sum _{ x\in \mathcal {X}} p(x) \log p(x)= - \mathbf {E}\left (\log p(X)\right ).$$
(5.1)

The entropy is a characteristic of a single random variable. Now suppose that two probability distributions \(p(x)\) and \(q(x) \) are given on one set.

The relative entropy, or the Kullback–Leibler divergence, between two distributions \(p(x) \) and \(q(x) \) is defined as

$$ D(p\parallel q) = \mathbf {E}_p\left (\log \frac {p(x)}{q(x)}\right ),$$
(5.2)

where \(\mathbf {E}_p(\Phi ) \) is the expectation of the function \(\Phi \) determined using the rule

$$ \mathbf {E}_p(\Phi )=\sum _{ x\in \mathcal {X}} p(x)\Phi (x).$$

Properties of relative entropy.

  1. Let \(p(x) \) and \(q(x) \), \(x\in \mathcal {X} \), be two probability distributions (two measures). Then

    $$ D(p\parallel q)\geqslant 0.$$

    The equality to zero is achieved if and only if \(p(x)=q(x) \) for all \(x \).

  2. In the general case,

    $$ D(p\parallel q)\neq D(q\parallel p). $$
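
These definitions and properties are easy to check numerically. The following sketch (our own illustration, in Python with NumPy, for two hand-picked binary distributions) computes the entropy (5.1) and the divergence (5.2) and exhibits both nonnegativity and asymmetry:

```python
import numpy as np

def entropy(p):
    """Shannon entropy H(X) in nats, cf. (5.1)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                       # convention: 0 log 0 = 0
    return float(-np.sum(p * np.log(p)))

def kl(p, q):
    """Kullback-Leibler divergence D(p || q), cf. (5.2)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p = [0.5, 0.5]
q = [0.9, 0.1]
print(entropy(p))          # ~0.693 = ln 2, the maximum for a binary alphabet
print(kl(p, q), kl(q, p))  # ~0.511 and ~0.368: nonnegative but unequal
print(kl(p, p))            # 0.0: the divergence vanishes for p = q
```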

Consider two random variables \(X\) and \(Y \) with joint probability function \(p(x,y) \) and probability functions \(p(x) \) and \(p(y) \).

The mutual information \(I(X;Y) \) is the relative entropy between the joint distribution and the product \(p(x)p(y)\) of the distributions; i.e.,

$$ I(X;Y)=D(p(x,y)\parallel p(x)p(y)) = \sum _{ x\in \mathcal {X}}\sum _{ y\in \mathcal {Y}} p(x,y) \log \left (\frac {p(x,y)}{p(x)p(y)}\right ).$$
(5.3)

Let us define the differential entropy of a random variable, a concept that is important for the exposition to follow. Let \(X\) be a continuous \(m \)-dimensional random variable, and let \(f(x) \) be the probability distribution density for \(X \). The set where \(f(x)>0 \) is called the support set of \(X \).

The differential entropy \(h(X) \) for \(X \) with distribution density \(f(x) \) is defined as

$$ h(X)= - \mathbf {E}\left (\log f(X)\right )=-\int \limits _{S} f(x)\log f(x)\thinspace dx, $$

where \(S\) is the support set of the random variable.

Let us define relative entropy by analogy with the discrete case.

The relative entropy, or the Kullback–Leibler divergence \(D(f\parallel g) \), between the densities \(f(x)\) and \(g(x) \) is defined as

$$ D(f\parallel g)= \int \limits _{{\mathbb R}^m} f(x) \log \left (\frac {f(x)}{g(x)}\right )dx_1 \cdots dx_m. $$
(5.4)

Properties of relative entropy.

  1. \( D(f\parallel g)\) is finite if the support set of the function \(f(x) \) is contained in the support set of \(g(x) \).

  2. \( D(f\parallel g)\geqslant 0\), with the equality achieved if and only if \(f=g \) almost everywhere.

  3. In these definitions, the convention \(0 \log \frac {0}{0}=0 \) is used.

Let us define mutual information for continuous random variables. Let \(X \) and \(Y \) be two random \(m \)-dimensional variables with joint probability density distribution function \(f(x,y)\) and probability density functions \( f(x)\) and \(f(y) \).

The mutual information \(I(X;Y) \) is defined as

$$ I(X;Y) =D(f(x,y)\parallel f(x)f(y)) = \int \limits _{{\mathbb R}^{2m}} f(x,y) \log \left (\frac {f(x,y)}{f(x)f(y)}\right )dx_1 \cdots dx_m\thinspace dy_1 \cdots dy_m. $$

The concept of relative entropy plays an important role not only in information theory (data compression) but also in other scientific disciplines such as statistical physics, probability theory, and financial mathematics [46]. In the next section, we will show what role this concept plays in control theory. More details about the definitions introduced and their properties can be found in [59, 108, 129].

The paper [198] published in 1988 was apparently the first paper that rigorously used information-theoretic concepts in the statement of control problems. In this paper, it was proposed to seek the optimal control from the condition of maximizing the differential entropy associated with the probability distribution function constructed on the set of controls. It was shown that optimizing the mean loss function is equivalent to minimizing the control entropy under the condition of the worst entropy density function.

In [151], relative entropy was proposed as a performance criterion in control design. This direction in control theory has been developing quite successfully [152, 153]. Note that the concept of anisotropy-based control, also based on the concept of relative entropy (the Kullback–Leibler divergence), was proposed by I.G. Vladimirov two years earlier in [201].

5.2. Key Definitions in Anisotropy-Based Theory

The concepts of anisotropy of a random vector, mean anisotropy of a sequence of random vectors, and anisotropic norm of a linear time-invariant system first appeared in [201] in 1994. The anisotropy of a random vector is defined as the minimum relative entropy (the Kullback–Leibler divergence) between the distribution density of a random vector and the distribution density of a Gaussian signal with zero mean and scalar covariance matrix [24]. The mean anisotropy of an infinite sequence of random vectors is defined via the anisotropy of a sequence element in the same way as the entropy per degree of freedom is defined (the term is borrowed from [60]). The entropy per degree of freedom is the limit of the ratio of entropy of \(n \) random variables to \(n \) as \(n \) tends to infinity. The entropy per degree of freedom is also called the time-invariant source entropy per message [31]. The analog of this concept in English is the term “entropy rate” [108]. The mean anisotropy is defined as the limit of the ratio of the anisotropy of a vector composed of \(n \) random vectors to \(n \) as \(n \) tends to infinity.

Definition of anisotropy of a random vector.

Recall that \(\mathfrak {L}_2^m\) is the class of \( {{\mathbb R}}^m\)-valued absolutely continuously distributed random vectors with finite second moment.

For each \(\lambda > 0\), by \(p_{m,\lambda } \) we denote the probability density function on \({{\mathbb R}}^m\) for a Gaussian signal with zero mean and scalar covariance matrix \(\lambda I_m\),

$$ p_{m,\lambda }(x) = (2\pi \lambda )^{-m/2} \exp \left (-\frac {|x|^2}{2\lambda }\right ), \qquad x \in {{\mathbb R}}^m. $$
(5.5)

For each \(w \in \mathfrak {L}_2^m \) with probability density function \(f: {{\mathbb R}}^m \to {{\mathbb R}}_+\), its relative entropy with respect to (5.5) takes the form [49]

$$ D \left ( f \parallel p_{m,\lambda } \right ) = \mathbf {E} \ln \frac {f(w)}{p_{m,\lambda }(w)} = -h(w) + \frac {m}{2} \ln (2\pi \lambda ) + \frac {{\mathbf {E}} |w|^2}{2\lambda },$$
(5.6)

where

$$ h(w) = -\mathbf {E} \ln f(w) = - \int \limits _{{\mathbb R}^m} f(w) \ln f(w) d w$$
(5.7)

is the differential entropy of the random vector \(w\).

Definition 1.

The anisotropy \( \mathbf {A}(w)\) of a random vector \(w \in \mathfrak {L}_2^m\) is defined as the minimum information deviation of its distribution from Gaussian distributions on \({{\mathbb R}}^m \) with zero mean and scalar covariance matrices,

$$ \mathbf {A}(w) = \min _{\lambda > 0} D \left ( f \parallel p_{m,\lambda } \right ).$$
(5.8)

A straightforward computation shows that the minimum in (5.6) over all possible \(\lambda > 0 \) is achieved at \(\lambda =\mathbf {E}|w|^2/m \), and consequently,

$$ \mathbf {A}(w) = \min _{\lambda > 0} D \left ( f \parallel p_{m,\lambda } \right ) = \frac {m}{2} \ln \left ( \frac {2\pi {\rm e}}{m}\thinspace \mathbf {E} |w|^2 \right ) - h(w). $$
(5.9)
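
The computation leading to (5.9) is easy to reproduce numerically. In the sketch below (our own example), \(w\) is Gaussian with the hypothetical covariance \(\mathrm {diag}(1,3)\), so \(h(w)\) and \(\mathbf {E}|w|^2\) are known in closed form; a grid search over \(\lambda \) in (5.6) recovers the minimizer \(\lambda =\mathbf {E}|w|^2/m\) and the value (5.9):

```python
import numpy as np

# Assumed toy covariance of a Gaussian vector w (m = 2)
Sigma = np.diag([1.0, 3.0])
m = Sigma.shape[0]
h = 0.5 * np.log((2.0 * np.pi * np.e)**m * np.linalg.det(Sigma))  # differential entropy
Ew2 = np.trace(Sigma)                                             # E|w|^2

# evaluate D(f || p_{m,lambda}) from (5.6) on a grid of lambda
lam = np.linspace(0.1, 10.0, 100_001)
D = -h + 0.5 * m * np.log(2.0 * np.pi * lam) + Ew2 / (2.0 * lam)

lam_star = lam[np.argmin(D)]
print(lam_star)                                            # ~2.0 = E|w|^2 / m
print(D.min())                                             # anisotropy A(w), ~0.144
print(0.5 * m * np.log(2.0 * np.pi * np.e * Ew2 / m) - h)  # closed form (5.9), same value
```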

Properties of anisotropy.

Let \(\mathbb {G}^m(\Sigma ) \) be the class of \({{\mathbb R}}^m \)-dimensional Gaussian random vectors \(w \) with \(\mathbf {E} w = 0 \) and \( \mathbf {cov}(w) = \Sigma \), \(\det \Sigma \neq 0 \), and let

$$ p(w) = (2\pi )^{-m/2} (\det \Sigma )^{-1/2} \exp \left (- \frac {1}{2}\|w\|_{\Sigma ^{-1}}^2\right ) $$

be the corresponding probability density function, where \( \|x\|_Q = \sqrt {x^{\top } Q x}\) is the (semi)norm of a vector \(x\) induced by a symmetric positive semidefinite matrix \(Q \) (a norm if \(Q>0 \)). Then

(a) :

For each positive definite matrix \(\Sigma \in {{\mathbb R}}^{m\times m} \), one has

$$ \min _{w} \left \{ \mathbf {A}(w):\ w \in \mathfrak {L}_2^m,\ {\mathbf {E}} (ww^{\top }) = \Sigma \right \} = -\frac {1}{2} \ln \det \frac {m \Sigma }{\mathrm {tr} \Sigma },$$
(5.10)

and the minimum is attained only at \(w \in \mathbb {G}^m(\Sigma ) \).

(b) :

For each \( w \in \mathfrak {L}_2^m\), we have \(\mathbf {A}(w) \geqslant 0\), with \(\mathbf {A}(w) = 0 \) if and only if \(w \in \mathbb {G}^m(\lambda I_m)\).

(c) :

The anisotropy \(\mathbf {A}(w)\) is invariant with respect to the rotation and central dilatation of the vector \(w \); i.e., \(\mathbf {A}(\lambda U w) = \mathbf {A}(w)\) for each scalar \(\lambda \in \mathbb {R}\setminus \{0\}\) and each orthogonal matrix \(U \in \mathbb {R}^{m\times m}\).
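
Properties (a)–(c) can be checked directly for Gaussian vectors, for which the minimal anisotropy is given by the right-hand side of (5.10). A small sketch (our own illustration; note that under the transformation \(w \mapsto \lambda U w\) of property (c) the covariance transforms as \(\Sigma \mapsto \lambda ^2 U \Sigma U^{\top }\)):

```python
import numpy as np

def min_anisotropy(Sigma):
    """Right-hand side of (5.10): minimal anisotropy over vectors with E(ww^T) = Sigma."""
    Sigma = np.asarray(Sigma, dtype=float)
    m = Sigma.shape[0]
    _, logdet = np.linalg.slogdet(m * Sigma / np.trace(Sigma))
    return -0.5 * logdet

print(min_anisotropy(np.eye(3)))            # 0: isotropic Gaussian, property (b)
print(min_anisotropy(np.diag([1.0, 3.0])))  # positive for a non-scalar covariance

# property (c): invariance under rotation and dilatation, Sigma -> c * U Sigma U^T
U = np.array([[0.0, -1.0], [1.0, 0.0]])     # a rotation by 90 degrees
S = np.diag([1.0, 3.0])
print(min_anisotropy(4.0 * U @ S @ U.T))    # same value as for S
```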

Mean anisotropy of a Gaussian signal.

Let \(V = \left \{v_k\right \}_{k \in \mathbb {Z}} \) be a discrete \(m \)-dimensional Gaussian white noise with zero mean and identity covariance matrix,

$$ {\mathbf {E}\thinspace } v_k = 0 ,\quad {\mathbf {E}\thinspace } \left ( v_k v_k^T \right ) = I_m ,\quad -\infty < k < +\infty . $$

Consider an \(m \)-dimensional stationary Gaussian sequence

$$ W = \left \{w_k\right \}_{k \in \mathbb {Z}} = G \otimes V$$

generated from the white noise \(V\) by a shaping filter \(G \) with impulse response \(g_k \in {{\mathbb R}}^{m \times m}\), \(k \geqslant 0 \),

$$ w_j = \sum _{k=0}^{+\infty } g_k\thinspace v_{j-k} ,\quad -\infty < j < +\infty .$$

Such a filter is identified with its transfer function,

$$ G(z) = \sum _{k=0}^{+\infty } g_k\thinspace z^k ,$$

which is assumed to belong to the Hardy space \(H_2^{m\times m}\).

The sequence \(W\) has zero mean and spectral density

$$ S(\omega ) = \frac {1}{2\pi }\thinspace {\widehat G}(\omega ) \left ({\widehat G}(\omega )\right )^* ,\quad \omega \in \Omega =[-\pi ; \pi ] .$$
(5.11)

In the sequence \(W \), we isolate a fragment of length \(N \) assembled into the \(Nm \)-dimensional vector

$$ W_{0:N-1} = \left [ \begin {array}{c} w_0\\ \vdots \\ w_{N-1} \end {array} \right ],$$

where each vector \(w_i \) belongs to \({{\mathbb R}}^m \), \(i=0,\ldots ,N-1 \).

Definition 2.

The mean anisotropy of the sequence \(W\) is determined as follows [20]:

$$ {\overline {\mathbf {A}}}(W)=\lim \limits _{N\rightarrow +\infty } \frac {\mathbf {A}( W_{0:N-1} )}{N} .$$

It was also proved in [20] that the mean anisotropy of the sequence \(W = G \otimes V\) can be defined as

$$ {\overline {\mathbf {A}}}(G) = -\frac {1}{4\pi }\thinspace \int \limits _{\Omega } \ln \det \left (\frac {m}{\|G\|_2^2}\thinspace {\widehat G}(\omega ) \left ({\widehat G}(\omega )\right )^*\right )\thinspace d\omega .$$
(5.12)

The functional (5.12) possesses nonnegative finite values as long as the shaping filter \(G \in H_2^{m\times m}\) has the maximum (more precisely, full row) rank; i.e.,

$$ \mathrm{rank}\, {\widehat G}(\omega ) = m, \quad \text{for a.a.}\ \omega \in \Omega .$$

If, however, the filter \(G \) is not of maximum rank, then \({\overline {\mathbf {A}}}(G) = +\infty \). Note that \({\overline {\mathbf {A}}}(G) = 0\) if and only if the shaping filter \(G \) is an all-pass system up to a nonzero constant factor, i.e., if the sequence \(W\) is Gaussian white noise with a scalar covariance matrix.
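
Formula (5.12) is straightforward to evaluate by quadrature. As an illustration (our own toy example, not from the cited papers), take the scalar first-order filter with impulse response \(g_k = a^k\), \(|a|<1\), so that \({\widehat G}(\omega ) = 1/(1-a\thinspace {\rm e}^{j\omega })\) and \(\|G\|_2^2 = 1/(1-a^2)\); Jensen's formula then gives the closed-form value \({\overline {\mathbf {A}}}(G) = -\frac {1}{2}\ln (1-a^2)\), which the direct computation reproduces:

```python
import numpy as np

# Scalar (m = 1) shaping filter with impulse response g_k = a^k (assumed example)
a = 0.5
omega = np.linspace(-np.pi, np.pi, 200_001)
dw = omega[1] - omega[0]
Ghat2 = np.abs(1.0 / (1.0 - a * np.exp(1j * omega)))**2   # |Ghat(omega)|^2

G2 = np.sum(Ghat2) * dw / (2.0 * np.pi)                   # ||G||_2^2 = 1/(1-a^2)
Abar = -np.sum(np.log(Ghat2 / G2)) * dw / (4.0 * np.pi)   # mean anisotropy (5.12), m = 1

print(Abar)                     # ~0.144: positive, since this G is not all-pass
print(-0.5 * np.log(1 - a**2))  # closed form via Jensen's formula, same value
```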

It can be seen that the mean anisotropy (5.12) is a characteristic of the probability distribution of the Gaussian sequence \(W = G\otimes V\) rather than its individual trajectories.

Note also that the mean anisotropy functional \({\overline {\mathbf {A}}}(G) \) is invariant under the transformation \(G \mapsto \alpha \thinspace U_1 G U_2\) with an arbitrary nonzero constant factor \(\alpha \in {{\mathbb R}} \) and arbitrary all-pass systems \(U_1, U_2 \in H_2^{m \times m} \) (for which the matrices \({\widehat U}_1(\omega ) \) and \({\widehat U}_2(\omega ) \) are unitary for almost all \(\omega \in \Omega \)). In particular, it follows that \({\overline {\mathbf {A}}}(G) \) is completely determined by the eigenvalue functions of the spectral density (5.11), which coincide, up to the constant factor \(\frac {1}{2\pi }\), with the squared singular value functions of \(\widehat G\).

The mean anisotropy functional (5.12) is closely related, on the one hand, to the information-theoretic approach to the quantitative description of chaos based on the Kolmogorov \(\epsilon \)-entropy of probability distributions [26, 65] and, on the other hand, to the principle of isotropy of a finite-dimensional Euclidean space. The reader interested in the details of these relationships is referred to the papers by I.G. Vladimirov [20, 56]. These relationships allow interpreting the mean anisotropy (5.12) in different (but intrinsically unified) ways.

Here we only note that (5.12) can be viewed, under natural conditions, as a quantitative complexity index of the Gaussian sequence \(W=G\otimes V \).

Using the multidimensional version of the Kolmogorov–Szegő formula [27], a formula for calculating the mean anisotropy of a random sequence generated from Gaussian white noise by a shaping filter was obtained in [113]. The filter is defined by its state-space realization \((A, B, C, D)\). To calculate the mean anisotropy, one has to solve an algebraic Riccati equation and a Lyapunov equation, which include the matrices \(A, B, C, D \) of the shaping filter.

Based on the results in [69], the calculation of the mean anisotropy was reduced in [207] to solving linear matrix inequalities. The propagation of mean anisotropy through various interconnections of linear filters was discussed in [163].

Within the framework of the anisotropy-based control theory, the inverse problem is also solved: given a level of mean anisotropy (spectral color) at the output of the shaping filter, construct the filter parameters. The algorithms for solving the inverse problem are described in [38, 166].

Anisotropic norm of the system.

Consider a stable linear discrete-time system \(F \) given in the state space in the form

$$ x_{k+1} = A x_k+B w_k,$$
(5.13)
$$ y_k = C x_k+D w_k,$$
(5.14)

where \(x_k \in \mathbb {R}^n\) is the system state vector, \(W=\{w_k\}_{k \in \mathbb {Z}}\) is a stationary Gaussian sequence of \(m \)-vectors with bounded level of mean anisotropy \({\overline {\mathbf {A}}}(W) \leqslant a\) ( \(a\geqslant 0 \)) and zero mean, and \(y_k \in \mathbb {R}^p \) is the system output.

Denote by \(Y=\{y_k\}_{k \in \mathbb {Z}} \) the output sequence of system (5.13)–(5.14). Let us define the power norm of the sequence \(Y\) by the formula

$$ \|Y\|_{_\mathcal {P}} = \sqrt {\lim \limits _{N\to \infty }\frac {1}{N}\sum _{k=0}^{N-1}{\mathbf {E}}|y_k|^{2}}.$$

Assuming that \(\|Y\|_{_\mathcal {P}} \) and \(\|W\|_{_\mathcal {P}} \) are finite, we define the root-mean-square gain (RMSG) for a given system \(F \) with input signal \(W=\{w_k\}_{k \in \mathbb {Z}} \) as

$$ Q(F,W) = \frac {\|Y\|_{_\mathcal {P}}}{\|W\|_{_\mathcal {P}}}.$$

Definition 3.

For a given value of \(a \geqslant 0\), the anisotropic norm of the system \(F\) is defined as

$${| \! | \! |} F {| \! | \! |}_{a} = \sup _{ {\overline {\mathbf {A}}}(W) \leqslant a}Q(F,W) . $$
(5.15)

Thus, the anisotropic norm \({| \! | \! |} F{| \! | \! |}_a \) characterizes the stochastic gain of the system \(F \) over input signals \(W \) with mean anisotropy not exceeding \(a \).

Properties of the anisotropic norm of the system.

For an \(F \in H_{\infty }^{p\times m} \), its \(a \)-anisotropic norm is a nondecreasing function of the parameter \(a \geqslant 0\) and satisfies the inequalities

$$ \|F\|_2/\sqrt {m} = {| \! | \! |} F{| \! | \! |}_0 \leqslant {| \! | \! |} F{| \! | \! |}_a \leqslant \lim _{a \to +\infty } {| \! | \! |} F{| \! | \! |}_a = \|F\|_{\infty }.$$
(5.16)

Calculating the norm \({| \! | \! |} F{| \! | \! |}_a \) for \(a > 0 \) is only of interest if

$$ \|F\|_2 < \sqrt {m}\thinspace \|F\|_{\infty }. $$
(5.17)

We have \(\|F\|_2 = \sqrt {m}\thinspace \|F\|_{\infty }\) if and only if \(F^{\top } F = \lambda I_m \) for some \(\lambda \geqslant 0 \). In particular, (5.17) holds if \(F \ne 0\) and \(p <m \).
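
The endpoints in (5.16) and condition (5.17) are easiest to see for a memoryless system \(y_k = F w_k\) with a constant matrix \(F\) (a toy case of our own): the \(H_2\)-norm is then the Frobenius norm of \(F\), and the \(H_\infty \)-norm is its largest singular value.

```python
import numpy as np

# Generic static gain: inequality (5.17) should be strict here
F = np.array([[1.0, 0.0],
              [0.5, 2.0]])
m = F.shape[1]

h2 = np.linalg.norm(F, 'fro')                  # H2 norm of the static system
hinf = np.linalg.svd(F, compute_uv=False)[0]   # H-infinity norm
print(h2 / np.sqrt(m), hinf)   # endpoints of (5.16) bracketing the anisotropic norm

# When F^T F = lambda*I (here lambda = 9), the two endpoints coincide
# and computing the anisotropic norm is of no interest, cf. (5.17)
Q = 3.0 * np.array([[0.0, 1.0],
                    [-1.0, 0.0]])
print(np.linalg.norm(Q, 'fro') / np.sqrt(2),
      np.linalg.svd(Q, compute_uv=False)[0])   # both equal 3.0
```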

The anisotropic norm does not possess the circular property (2.11); however, the following pseudomultiplicative property of the anisotropic norm was proved in [113]: for each \(a \geqslant 0\) and any systems \(F\in H_{\infty }^{p\times m} \) and \(G \in H_{\infty }^{m\times m} \), one has

$$ {| \! | \! |} FG{| \! | \! |}_a \leqslant {| \! | \! |} F{| \! | \! |}_b\thinspace {| \! | \! |} G{| \! | \! |}_a, $$
(5.18)

where

$$ b = a + {\overline {\mathbf {A}}}(G) + m\thinspace \ln \left ( \sqrt {m}\thinspace {| \! | \! |} G {| \! | \! |}_a/\|G\|_2 \right ). $$
(5.19)

In [22], the asymptotic behavior of the anisotropic norm is described as the mean anisotropy level \(a \) tends to zero and to infinity (where, in accordance with (5.16), the anisotropic norm tends to the scaled \(H_2 \)-norm \(\|F\|_2/\sqrt {m} \) and to the \(H_{\infty } \)-norm, respectively).

The papers [113, 224] contain formulas and a numerical algorithm for calculating the anisotropic norm in the frequency domain and in the state space. To calculate the anisotropic norm of the system \(F\) in the state space, it is necessary to solve matrix algebraic Riccati and Lyapunov equations as well as an algebraic equation of a special form. The former two equations include the parameters of the matrices \((A, B, C, D) \) of the representation of the system, and the latter equation includes the matrices that are solutions of the above algebraic Riccati and Lyapunov equations as well as the parameter of the level of the mean anisotropy of the input signal of the system \(F \).

A numerical algorithm for calculating the system anisotropic norm is based on the Newton iteration method and has a rather high computational complexity. To overcome this drawback, the later papers [70, 207] proposed new approaches to estimating the anisotropic norm by convex optimization methods. The result stated in [70] allows estimating the anisotropic norm of the system with application of modern computational packages for solving semidefinite programming problems.

5.3. Optimal Anisotropy-Based Problem and Its Solution

In [21, 201], the problem of synthesizing a controller minimizing the anisotropic norm of the closed-loop system was stated for time-invariant systems; this problem was solved in [225].

Fig. 5. Illustration of the stochastic \( H_{\infty }\)-optimization problem.

Suppose that we are given a system \(F\) depicted in Fig. 5 with the state space realization (2.13). The external disturbance is a sequence of random vectors with mean anisotropy level \(a\).

Let us state the anisotropic optimization problem.

Problem 1.

For a given system (2.13) and a level \(a \geqslant 0 \) of mean anisotropy of the input disturbance \(W \), find a controller \(K \) that minimizes the \(a \)-anisotropic norm of the transfer function of the closed-loop system (2.16),

$$ {| \! | \! |} {\cal L}(F,K) {| \! | \! |}_a \equiv \sup \thinspace \left \{ \frac {\|{\cal L}(F,K) G\|_2}{\|G\|_2}:\ G \in \mathbb {G}_a \right \} \thinspace \rightarrow \thinspace \inf \limits _{ K \in {\mathbb {K}}}. $$
(5.20)

Here \(\mathbb {G}_a\) stands for the set of filters whose output contains a signal with mean anisotropy level less than or equal to \(a \), and \(\mathbb {K} \) is the set of stabilizing controllers.

By analogy with the \(H_2\)- and \(H_{\infty } \)-optimization problems (3.1) and (3.2), the anisotropy-based optimization problem can be reformulated as

$$ {| \! | \! |} T_{zw}{| \! | \! |}_{a} \rightarrow \min \limits _K .$$

This problem (just as the \(H_{\infty }\) -optimization problem) can be considered as an antagonistic two-player game in which the control (the controller \(K \)) is the first player and the second player is the shaping filter \(G \) generating the input disturbance \(W \). Introduce the sets

$$ \begin {aligned} \mathbb {K}_a^{\diamond }(G) &\doteq \mathrm {Arg} \min _{K \in \mathbb {K}(F)}{\|{\cal L}(F,K)\thinspace G\|_2 },&\quad G &\in \mathbb {G}_{a}, \\[.3em] \mathbb {G}_a^{\diamond }(K) &\doteq \mathrm {Arg} \max _{G \in \mathbb {G}_a}{\frac {\|{\cal L}(F,K)\thinspace G\|_2}{\|G\|_2} },&\quad K &\in \mathbb {K}(F). \end {aligned}$$

The scheme for solving the problem is as follows. If we are able to construct the “worst case” filter \(G \in \mathbb {G}^{\diamond }_a(K)\) that maximizes \(\frac {\| {\cal L}(F,K)\thinspace G\|_2}{\|G\|_2}\) for a system \(F \) and a given mean anisotropy level \(a \), then the problem is reduced to that of designing an optimal controller \(K\) that minimizes \(\|{\cal L}(F,K)\thinspace G\|_2\) with this worst case filter.

The design algorithm requires solving a system consisting of three algebraic Riccati equations, a Lyapunov equation, and an algebraic equation of a special form. One algebraic Riccati equation, the Lyapunov equation, and the special-form equation are needed to construct the worst-case filter, while the remaining two algebraic Riccati equations play the same role as the two Riccati equations in the \(LQG/H_2 \)- and \(H_{\infty } \)-theories: one serves to construct an estimate \(\hat {x}\) of the system state vector, and the other to construct the control itself. Moreover, all these equations are coupled. The paper [172] describes a homotopy method for solving coupled Riccati equations.

A homotopy method for solving the system of equations required for anisotropy-based controller design was developed in [112]. The application of the homotopy method in anisotropy-based control theory, in particular in combination with Newton iterations, is described in [23].

As was noted above, both the \(H_{\infty } \)-optimal problem and the anisotropy-based optimal problem are game problems. By inequality (5.16), the anisotropy-based controller occupies an intermediate position between the \(LQG/H_2\)- and \(H_{\infty } \)-controllers. This suggests that the \(LQG \) controller is minimax as well. This is indeed true, because Gaussian white noise (the input signal of the system in the \(LQG \) problem) is the worst-case disturbance in terms of the criterion of maximum entropy of the input signal (see [108]).

5.4. Some Properties of Anisotropic Controllers

As expected, control systems closed by anisotropy-based controllers are more robust than systems closed by \(H_2\)-controllers and less conservative than systems closed by \(H_{\infty }\)-controllers. In [23, 35, 159], the capabilities of \(H_2 \)-, \(H_{\infty } \)-, and anisotropy-based controllers were compared in the problem of rejecting a wind-shear-type external disturbance during aircraft landing. The anisotropy-based controllers demonstrated significant advantages over the \(H_{\infty } \)-controllers in terms of control energy consumption. At the same time, the anisotropy-based controllers are more robust than the \(H_2 \)-controllers.

Recall that the \(LQG/H_2\)-optimal problem has a unique solution, while the solution of the \(H_{\infty } \)-optimal problem is nonunique. The solution of the optimal anisotropy-based problem is also unique.

Note a very important property of anisotropy-based controllers. By analogy with the above problem of designing \(H_{\infty }\) -controllers that minimize the entropy functional (4.1), there is also a relationship between the entropy functional and anisotropy. In [45], it was established how the entropy functional can be interpreted from the point of view of the information-theoretic approach. In this paper, it was shown that the anisotropy-based controller also minimizes the \(H_{\infty } \)-entropy of the closed-loop system for some fixed value of the parameter \(\gamma \).

Above, we considered the well-known separation principles in the \(LQG/H_2 \)-optimal control theory (see, e.g., [47, 48]) and in the \(H_{\infty } \)-suboptimal control theory (see, e.g., [117, 130]). It is remarkable that the separation principle can also be formulated in the anisotropy-based theory. In the anisotropy-based theory, this principle may read like this: “The optimal anisotropy-based full-order controller is the optimal estimator of the optimal control law in the problem with complete information about the system state vector for the case of the worst-case input.”

Note that the separation principle does not mean the independence of the Riccati equations. It is rather close to the principle of separation in the \(H_{\infty } \)-suboptimal control problem. Here the problem of synthesizing an estimator and the problem of synthesizing a static feedback controller cannot be solved independently of each other. For more information, see [225].

6. EVOLUTION OF ANISOTROPY-BASED THEORY

In this section, we trace the evolution of the robust anisotropy-based control theory since the beginning of the 2000s, relying on the definitions and concepts of the previous section. Statements and solutions of the following problems will be presented: robust stability; anisotropy-based control design for systems with parametric uncertainties; suboptimal control design using convex optimization methods; development of an anisotropy-based theory of analysis and control for descriptor systems, systems with noncentered input signals, and time-varying systems; and solution of anisotropy-based filtering problems.

6.1. Robust Stability in Anisotropy-Based Theory

As was indicated when listing the properties of the anisotropic norm, it does not possess the circular property (2.11). However, there exists an analog of this property. Based on relations (5.18) and (5.19), robust stability in the anisotropy-based theory was investigated in [36, 37].

Consider an operator \(F\) defining the input–output relations in the form

$$ \left [ \begin {array}{c} z_1 \\ z_2 \end {array}\right ] = \left [ \begin {array}{{cc}} F_{{11}} & F_{12} \\ F_{21} & F_{{22}} \end {array} \right ] \left [ \begin {array}{c} w_1 \\ w_2 \end {array} \right ]. $$
(6.1)

Here \(w_1\in {{{\mathbb R}}^m} \) and \(w_2\in {{{\mathbb R}}^p} \) are the system inputs, and \(z_1\in {{{\mathbb R}}^m} \) and \(z_2\in {{{\mathbb R}}^q} \) are the system outputs, with \(z_1 \) not necessarily being measurable.

System (6.1) is said to be internally stable if the matrix transfer function from the input \((w_1,w_2)^T\in {{\mathbb R}}^{m+p} \) to the output \((z_1,z_2)^T\in {{\mathbb R}}^{m+q}\) is asymptotically stable [130]; this is equivalent to the transfer functions \(F_{z_1w_1} \), \(F_{z_1w_2} \), \(F_{z_2w_1} \), and \(F_{z_2w_2} \) being analytic outside the unit disk. Here \(F_{z_iw_j} \) is the transfer function from the input \(w_j \) to the output \(z_i \), \(i,j = 1,2 \).

Problem 2.

For a given nominal plant \(F\), find how large the uncertainty \(\Delta \) in its parameters, quantitatively measured by the anisotropic norm, can be while the perturbed system remains internally stable.

The latter will imply the robustness of the plant \(F \) with respect to the uncertainty \(\Delta \).

Let us introduce the class of admissible uncertainties

$$ D_a({\epsilon })=\left \{ \Delta \in RH^{m\times m}_\infty :\quad {| \! | \! |}\Delta {| \! | \! |}_a<\epsilon \right \}.$$

Here and in the following, \(\Delta (j\omega )\) will denote the limit

$$ \Delta (j\omega ) = \lim _{r \to 1-0}\thinspace \Delta \left (r\thinspace \mathrm{e}^{j\omega }\right ) ,\quad \omega \in [-\pi ; \pi ] .$$

We say that an uncertainty \(\Delta \) is admissible for the plant \(F \) if \(\Delta \in RH^{m\times m}_\infty \) and the system \(\mathcal {U}(F,\Delta ) \) of the form (2.17) is internally stable.

Fig. 6. \(F \)–\(\Delta \) configuration.

Theorem 1.

Consider the system \( \mathcal {U}(F,\Delta )\) presented in Fig. 6, where \( \Delta :\mathfrak {L}_2^m\rightarrow \mathfrak {L}_2^m \) and \( F:\mathfrak {L}_2^{m+p}\rightarrow \mathfrak {L}_2^{m+q} \) are causal linear systems and the input–output relations are given by (6.1).

Suppose that

  1. \(F \) is stable and

    $$ {| \! | \! |} F_{{11}}{| \! | \! |}_c < \epsilon ^{-1}, $$
    (6.2)

    where \(c=a +m\ln \displaystyle \frac {{\epsilon }}{{\mathop \mathrm{ess\thinspace inf} \limits _{-\pi \leqslant \omega \leqslant \pi } }\underline \sigma (\Delta (j\omega ))} \) , \( \underline \sigma (\Delta ) =\sqrt {\lambda _{\min }(\Delta ^*\Delta )} \) is the minimum singular value of the operator \(\Delta \) , and \(\epsilon \) is some positive constant.

  2. The anisotropy level \(a \) is determined by the formula

    $$ a = -\displaystyle \frac {1}{2}\ln \det \frac {m\Sigma }{\mathrm {tr}\Sigma } - m\ln \frac {\epsilon }{\mathop {\mathrm {ess}\thinspace \mathrm {sup}}\limits _{-\pi \leqslant \omega \leqslant \pi }\underline \sigma (\Delta (j\omega ))},$$
    (6.3)

    where \(\Sigma = (I_m - qF_{{11}}^* F_{{11}})^{-1} \) and the parameter \( q\in [0,\|F_{{11}}\|_\infty ^{-2})\) satisfies the inequality

    $$ \displaystyle \mathrm {tr}\left [\left (I_m - {\epsilon }^2 F_{{11}}^*F_{{11}}\right ) \left ( I_m - qF_{{11}}^*F_{{11}}\right )^{-1}\right ] \leqslant 0.$$
    (6.4)

Then the closed-loop system \( \mathcal {U}(F,\Delta )\) is internally stable for all \(\Delta {\thinspace \in \thinspace } D_a({\epsilon })\) .

Theorem 1 provides sufficient conditions for the robustness of plants whose uncertainty is bounded in terms of the anisotropic norm. This theorem allows one to relax the conservative condition \(\|F_{{11}}\|_\infty <1/{\epsilon }\) of the small gain theorem by replacing it with condition (6.2). Here the anisotropy level \(a \) shows how much one can relax the conditions of the small gain theorem without losing robust stability.

One can specify how to find a boundary anisotropy level \(a \) that guarantees internal stability subject to (6.2) for a given realization of the nominal plant. Finding a suitable anisotropy level reduces to maximizing \(q \) under nonlinear constraints. One can read more about this in [37].

Based on the previous theorem, for a given system with various types of uncertainty (additive, multiplicative, or in the form of coprime factors) [130], one can find the admissible spread of the parameters whose exact values are unknown. Let us provide an analog of the Glover–McFarlane theorem [174] on robust stabilizability [51] of a system with additive uncertainty.

Theorem 2.

Consider a system \( {\mathcal {L}}(F+\Delta ,K)\) with a nominal plant \(F \) , an additive uncertainty \(\Delta \) , and a controller \(K \) , where \( \Delta :l_2\rightarrow l_2\) stands for additive disturbances, \(F:l_2\rightarrow l_2 \) is the nominal plant, \( K:l_2\rightarrow l_2\) is the controller, and \(\Delta \) , \(F \) , and \(K \) are linear causal systems. Suppose also that the maximal condition number \( \psi ={\mathop \mathrm{ess\thinspace sup} \limits _{-\pi \leqslant \omega \leqslant \pi } }{\mathrm {cond}\thinspace }(\Delta ^*\Delta ) \) of the uncertainty is known. The controller \(K\) stabilizes \(F+\Delta \) if

  1. The controller \(K \) stabilizes the nominal plant; i.e., the system \({\mathcal {L}}(F,K) \) is stable.

  2. \(\Delta \in D_a\left (\displaystyle \frac {1}{{| \! | \! |} K(I-FK)^{-1} {| \! | \! |}_{a + m\ln \psi }}\right )\) for some \( a\in [0,\infty ).\)

Similar results were obtained in the anisotropy-based theory with a nonzero mean of the input disturbance [167].

6.2. Suboptimal Anisotropy-Based Control. Design of Reduced- and Given-Order Controllers

Just as the optimal statements in the \( H_{2}\)- and \(H_{\infty } \)-control theories were followed by statements and solutions of suboptimal problems, as well as by problems of designing controllers of reduced and prescribed orders, the same development took place in the anisotropy-based theory. Suboptimal controllers stabilize the closed-loop system and ensure that its anisotropic norm is bounded by a given value; i.e., they guarantee rejection of random external disturbances whose mean anisotropy does not exceed a certain specified level. In contrast to the synthesis of an optimal anisotropy-based controller, the solution of suboptimal problems leads to a set of controllers, leaving additional degrees of freedom for imposing additional requirements on the closed-loop system so as to achieve the desired control performance.

6.2.1. Bounded real lemma in anisotropy-based theory

The key step in the suboptimal anisotropy-based control theory is the bounded real lemma, which has been formulated in terms of Riccati equations [164] as well as in terms of matrix inequalities [70].

Let us briefly present the anisotropy-based bounded real lemma following [70].

The model of a linear discrete time-invariant system \(F {\thinspace \in \thinspace } H_{\infty }^{p \times m}\) with \(m \)-dimensional input \(W \), \(n \)-dimensional state \(X \), and \(p \)-dimensional output \(Z \) has the form

$$ \begin {bmatrix} x_{k+1}\\[.2em] z_k \end {bmatrix} = \begin {bmatrix} A & B\\[.2em] C & D \end {bmatrix} \begin {bmatrix} x_k\\[.2em] w_k \end {bmatrix},$$
(6.5)

where the dimensions of the real matrices \(A \), \(B \), \(C \), and \(D \) are consistent and \(A \) is a Schur matrix (\(\rho (A)<1 \)). The input sequence \(W \) is assumed to be a stationary sequence of Gaussian random vectors whose mean anisotropy is bounded by a level \(a\geqslant 0 \). The problem is as follows.

Problem 3.

Given a system \(F \), a mean anisotropy level of the input disturbance \(a\geqslant 0 \), and a real scalar \(\gamma >0 \), verify the condition \({| \! | \! |} F{| \! | \! |}_a<\gamma \), where \({| \! | \! |} F{| \! | \! |}_a \) is the anisotropic norm of the system \(F \) defined by (5.15).

Theorem 3.

Let \(F \in H_{\infty }^{p \times m} \) be a system with the state space realization (6.5). The \(a \)-anisotropic norm (5.15) of the system \(F\) is strictly bounded by a prescribed threshold value \(\gamma >0 \), i.e.,

$$ {| \! | \! |} F{| \! | \! |}_a < \gamma ,$$

if there exists an \(\eta > \gamma ^2 \) such that the inequality

$$ \eta -\Big (\mathrm {e}^{-2a}\det (\eta I_m-B^\mathrm {T}\Phi B-D^\mathrm {T} D)\Big )^{1/m} < \gamma ^2 $$
(6.6)

holds for a real \( (n\times n)\)-matrix \( \Phi =\Phi ^\mathrm {T}\succ 0\) satisfying the LMI

$$ \left [ \begin {array}{{cc}} A^\mathrm {T}\Phi A -\Phi + C^\mathrm {T} C & A^\mathrm {T}\Phi B + C^\mathrm {T} D\\[.4em] B^\mathrm {T}\Phi A + D^\mathrm {T} C & B^\mathrm {T}\Phi B + D^\mathrm {T} D - \eta I_m \end {array} \right ] < 0.$$
(6.7)

The system of inequalities (6.6) and (6.7) can be solved using freeware packages such as the one in [171] with the solver [206] for the Matlab and Scilab systems. This method for calculating the \(a \)-anisotropic norm does not require solving a complicated system of cross-coupled equations with a computational algorithm based on the homotopy method.
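Conditions (6.6) and (6.7) can also be checked without a dedicated SDP package: for a fixed \(\eta \), a matrix \(\Phi \) on the boundary of the LMI (6.7) is the fixed point of the associated bounded-real Riccati map, reachable by plain iteration when \(A\) is Schur and \(\eta \) exceeds the squared \(H_\infty \)-norm of the system. A minimal sketch with toy data (all numbers are illustrative assumptions):

```python
import numpy as np

# Toy stable system of the form (6.5); all numbers are illustrative assumptions.
A = np.array([[0.5]]); B = np.array([[1.0]])
C = np.array([[0.3]]); D = np.array([[0.2]])
m = B.shape[1]
a, gamma, eta = 0.1, 0.9, 1.0        # anisotropy level, threshold, eta > gamma**2

# A Phi on the boundary of the LMI (6.7) is the fixed point of the
# bounded-real Riccati map; the iteration converges here because A is
# Schur and eta exceeds the squared H-infinity norm of the system.
Phi = np.zeros_like(A)
for _ in range(200):
    R = eta * np.eye(m) - B.T @ Phi @ B - D.T @ D    # stays positive definite
    S = A.T @ Phi @ B + C.T @ D
    Phi = A.T @ Phi @ A + C.T @ C + S @ np.linalg.solve(R, S.T)

# Determinant condition (6.6).
lhs = eta - (np.exp(-2 * a)
             * np.linalg.det(eta * np.eye(m) - B.T @ Phi @ B - D.T @ D)) ** (1 / m)
print(lhs < gamma ** 2)   # True: (6.6) holds, so the a-anisotropic norm is below gamma
```

A strictly feasible \(\Phi \) for (6.7) is obtained from this boundary solution by an arbitrarily small perturbation; an SDP solver applied to (6.6) and (6.7) directly, as described above, avoids the iteration altogether.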

As \(a\to +\infty \), the LMI (6.7) can be reduced to the form

$$ \begin {gathered} \left [ \begin {array}{{ccc}} A^\mathrm {T}\bar \Phi A-\bar \Phi & A^\mathrm {T} \bar \Phi B & C^\mathrm {T}\\[.3em] B^\mathrm {T} \bar \Phi A & B^\mathrm {T} \bar \Phi B-\gamma I_m & D^\mathrm {T}\\[.3em] C & D & -\gamma I_p \end {array} \right ] < 0, \end {gathered} $$
(6.8)

which is well known in the context of \(H_\infty \)-control for discrete-time systems (see, e.g., [111, 123]). This fact is closely related to the convergence \(\lim _{a\to +\infty }{{| \! | \! |} F{| \! | \! |}_a} =\|F\|_\infty \), and consequently, the inequality \({| \! | \! |} F{| \! | \! |}_a<\gamma \) “approximates”

$$ \|F\|_\infty <\gamma$$
(6.9)

for sufficiently large \( a.\) Thus, as \(a\to +\infty \), Theorem 3 becomes the bounded real lemma for the \(H_\infty \)-norm, establishing equivalence between the fulfillment of (6.9) and the existence of a positive definite solution of the LMI (6.8).

The bounded real lemma for the anisotropic norm in terms of inequalities is a key result that is used to solve problems of synthesizing anisotropy-based suboptimal controllers by convex optimization and semidefinite programming methods described in the next subsection.

6.2.2. Suboptimal anisotropy-based control design

The plant is represented by a linear discrete time-invariant model \(F \) with \(n_x \)-dimensional state \(X \), \(m_w \)-dimensional disturbance input \(W \), \(m_u \)-dimensional control input \(U \), \(p_z \)-dimensional controlled output \(Z \), and \(p_y \)-dimensional measured output \(Y \),

$$ F:\thinspace \thinspace \left [ \begin {array}{c} x_{k+1}\\ z_k\\ y_k \end {array} \right ] = \left [ \begin {array}{{ccc}} A & B_w & B_u\\ C_z & D_{zw} & D_{zu}\\ C_y & D_{yw} & 0 \end {array} \right ] \left [ \begin {array}{c} x_{k}\\ w_k\\ u_k \end {array} \right ], $$
(6.10)

where all matrices have compatible dimensions, \(p_z \leqslant m_w\), the pair of matrices \((A,B_u) \) is stabilizable, and the pair \((A,C_y) \) is detectable.

It is assumed that the mean anisotropy level (5.12) of the sequence \(W \) does not exceed a known nonnegative level \(a \).

An output feedback controller of a given order in the form of a dynamic compensator has the form

$$ K:\thinspace \thinspace \left [ \begin {array}{c} \xi _{k+1} \\ u_k \end {array} \right ] = \left [ \begin {array}{{cc}} A_{\mathrm {c}} & B_{\mathrm {c}} \\ C_{\mathrm {c}} & D_{\mathrm {c}} \end {array} \right ]\left [ \begin {array}{c} \xi _k \\ y_k \end {array} \right ], $$
(6.11)

where \(\xi _k\) is the \(n_\xi \)-dimensional state vector of the controller. The controller must stabilize the closed-loop system and guarantee a certain prescribed level of performance in rejecting external disturbances. It is assumed that the Kimura condition [158] of order \(n_\xi \),

$$ n_\xi > n_x-m_u-p_y,$$

is satisfied for the plant (6.10) and the controller (6.11). This condition guarantees the existence of a stabilizing controller of the prescribed order \(n_\xi \).

The general statement of the problem of synthesizing an anisotropy-based suboptimal controller of a given order is as follows.

Problem 4.

Given a plant \(F \) with the state space representation (6.10), a mean anisotropy level \(a\geqslant 0 \) of the input disturbance \(W \), and some desired threshold value \(\gamma >0 \), find a linear discrete time-invariant output controller \(K \) with the state-space representation (6.11) stabilizing the closed-loop system and guaranteeing that its \(a \)-anisotropic norm does not exceed the threshold value \(\gamma \); i.e.,

$$ {| \! | \! |} T_{zw}{| \! | \! |}_a < \gamma . $$

In [72], within the framework of anisotropy-based control theory, a general solution of the fixed-order controller design problem was obtained, and three particular cases of plant and controller structure were considered: a state feedback controller for a plant with a completely measurable state [212], a full-order dynamic output feedback controller, and a static output feedback controller. The design problem is solved by applying the criterion for checking that the anisotropic norm of a state-space model is bounded by a given threshold value. This criterion, the bounded real lemma for the anisotropic norm, was formulated in the previous subsection of the current survey. In [208, 209], a method is given for reducing the order of the optimal anisotropy-based controller that solves the anisotropy-based stochastic \(H_\infty \) full-order optimization problem [210]. Using convex optimization methods, the paper [71] solved the problem of designing a suboptimal anisotropy-based controller of a prescribed order in the form of a dynamic compensator.

6.2.3. Multichannel anisotropy-based control problem

Within the framework of the suboptimal control design concept, the paper [73] considers the problem of synthesizing anisotropy-based control for a linear discrete time-invariant system in which certain input-output channels are grouped according to the technical properties of the system or the proximity of signal properties.

Assume that \(N\) channel groups of controllable outputs \(Z_j\) (each consisting, in the minimal case, of one channel) are allocated in the controllable output vector \(Z \) of the plant (6.10) with respect to the technical design requirements. Let \(N \) channel groups of external inputs \(W_j \) also be selected in the external input vector \(W \). Identical groups of controllable output channels (\(Z_j = Z_i\)) or external input channels (\(W_j = W_i \)) are treated as distinct for \(j\neq i \). A similar division of the inputs and outputs of the plant into channel groups is proposed in [199, 200] in a much more general statement of the multichannel control problem. The division of channels into groups can be carried out in terms of the technical properties of the system (for example, reference signals/external disturbances/measurement noise) or the proximity of signal properties (for example, weakly/strongly correlated signals). For each group of external input channels \(W_j\), it is assumed that the mean anisotropy (5.12) of the sequence \(W_j \) does not exceed a known nonnegative level \(a_j \), \(j=1,\ldots ,N \). The anisotropy-based controller to be designed must simultaneously provide specified levels of disturbance attenuation for the respective groups of channels. Such problems are called multichannel problems.

Fig. 7. Closed-loop system in a multichannel problem.

Let \(T_{zw}(z) \) denote the matrix transfer function from the external input \(W \) to the controllable output \(Z \) of the closed-loop system with a dynamic output feedback compensator \(K\) of the given order (6.11) with the \(n_\xi \)-dimensional state \(\Xi =\{\xi _k\}_{k\in \mathbb {Z}}\), stabilizing the closed-loop system (Fig. 7) and ensuring a certain given level of rejection of external disturbances or a given performance in tracking reference signals. The matrix transfer function of the closed-loop system \(T_{zw}(z)\) is given by a lower fractional-linear transformation of the form (2.16) for the pair \((F,K) \). Then \(T_{z_jw_j}(z):=\mathcal {L}_j T_{zw}(z)\mathcal {R}_j\) is the matrix transfer function from the group of external inputs \(W_j\) to the group of controllable outputs \(Z_j\), \(j=1,\ldots , N \), where \(\mathcal {L}_j \) and \(\mathcal {R}_j \) are real matrices for selecting the groups of inputs and outputs, respectively. It is assumed that the Kimura condition \(n_\xi > n_x-m_u-p_y \) is satisfied for the plant (6.10) and the controller (6.11); this guarantees the existence of a stabilizing controller of the given order \( n_\xi \).
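The selection matrices \(\mathcal {L}_j \) and \(\mathcal {R}_j \) are simply row and column selectors. A minimal numpy sketch (the sizes and the channel grouping are hypothetical, chosen only for illustration):

```python
import numpy as np

# Hypothetical sizes and grouping (assumed for illustration): p_z = 4
# controlled outputs, m_w = 3 external inputs.
p_z, m_w = 4, 3
T = np.arange(p_z * m_w, dtype=float).reshape(p_z, m_w)  # stand-in for T_zw at one frequency

# Channel group j: outputs {0, 1} and inputs {0, 2}.
Lj = np.eye(p_z)[[0, 1], :]    # L_j selects the rows corresponding to Z_j
Rj = np.eye(m_w)[:, [0, 2]]    # R_j selects the columns corresponding to W_j
T_j = Lj @ T @ Rj              # transfer matrix of the channel group (W_j -> Z_j)
print(T_j)                     # rows 0-1 and columns 0, 2 of T
```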

The statement of the multichannel problem of design of an anisotropy-based suboptimal controller of a given order is as follows.

Problem 5.

Given a plant \(F \) with the state space model (6.10), the known mean anisotropy levels \(a_j\geqslant 0 \) of the groups of external inputs \(W_j \), and some set of desired threshold values \(\gamma _j>0 \), \(j=1,\ldots , N \), find a linear discrete time-invariant output controller \(K \) with the state space model (6.11) that stabilizes the closed-loop system and ensures the simultaneous fulfillment of the conditions

$$ {| \! | \! |} T_{z_j w_j}{| \! | \! |}_{a_j} < \gamma _j. $$
(6.12)

The solution of multichannel problems is of considerable practical importance. The developed technique for anisotropy-based control design in multichannel problems was applied to solving the multichannel problem of anisotropy-based control of the angular position of a gyro-stabilized platform with a variable angular momentum of a gyro unit subjected to external disturbances under measurement noise conditions [211].

6.3. Synthesis of Anisotropic Controllers for Plants with Parametric Uncertainty

It is natural to pose the anisotropy-based control problem for systems in which the plant is subject to perturbations, in particular parametric ones, i.e., to pose a synthesis problem similar to the \( H_{2}\)-control problem for perturbed plants mentioned above in Sec. 2.4.

The details of this section are published in [37]. Consider a linear discrete time-invariant system \(F \) described by the equations

$$ \begin {aligned} x_{k+1} &= \left (A + F_1 \Omega _k E_1\right )x_k + (B_1 + F_2 \Phi _k E_2)w_k + (B_2 + F_3 \Psi _k E_3)u_k,\\ z_k &= C_1 x_k + D_{12} u_k,\\ y_k &= C_2 x_k + D_{21} w_k, \end {aligned} $$
(6.13)

where \( k \in \mathbb {Z}\), \({\mathbf {E}\thinspace }|x_{-\infty } |^2<+\infty \), \(x_k\in {{\mathbb R}}^n\) is the state, \(z_k\in {{\mathbb R}}^{p_1} \) is the controllable output, \(u_k\in {{\mathbb R}}^{m_2} \) is the control input, \(w_k\in {{\mathbb R}}^{m_1} \) is the disturbance, and \(y_k\in {{\mathbb R}}^{p_2} \) is the measured output. All matrices occurring in system (6.13) are known except for the matrices \(\Omega _k \), \(\Phi _k \), and \(\Psi _k \) corresponding to the unknown parameters, of which we only know that they satisfy the constraints

$$ \Omega _k^\mathrm {T} \Omega _k \leqslant I, \quad \Phi _k^\mathrm {T} \Phi _k \leqslant I, \quad \Psi _k^\mathrm {T} \Psi _k \leqslant I, \quad k \in \mathbb {Z},$$
(6.14)

where \(I \) denotes the identity matrix of appropriate dimension.

The transfer function of the closed-loop system from the input \(W=\{w_k\}_{k\in \mathbb {Z}} \) to the output \(Z=\{z_k\}_{k\in \mathbb {Z}} \) is given, according to (2.16), by a lower fractional-linear transformation of the pair \((F,K) \),

$$ {\mathcal {L}}(F,K) = F_{{11}} + F_{12}K(I - F_{{22}}K)^{-1}F_{21}.$$
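This transformation is straightforward to evaluate numerically; below is a minimal numpy transcription with a scalar sanity check (all numbers are illustrative):

```python
import numpy as np

def lower_lft(F11, F12, F21, F22, K):
    """Lower fractional-linear transformation L(F, K) = F11 + F12 K (I - F22 K)^{-1} F21."""
    I = np.eye(F22.shape[0])
    return F11 + F12 @ K @ np.linalg.solve(I - F22 @ K, F21)

# Scalar check: 1 + 2 * 0.4 * (1 - 0.5 * 0.4)^{-1} * 3 = 1 + 0.8 * 3 / 0.8 = 4.
F11, F12, F21, F22, K = (np.array([[v]]) for v in (1.0, 2.0, 3.0, 0.5, 0.4))
cl = lower_lft(F11, F12, F21, F22, K)
print(cl[0, 0])   # approximately 4
```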

The a priori information on the probability distribution of the sequence \(W \) is as follows: \(W \) is a stationary Gaussian random sequence whose mean anisotropy is bounded from above by a nonnegative parameter \(a \). Let us denote the set of such sequences by \(\mathcal {W}_a \).

We introduce the matrix \(\Delta _k = \mathrm{diag}\left \{\begin {array}{ccc} \Omega _k, & \Phi _k, & \Psi _k \end {array}\right \} \) of all uncertainties in the system; then the set of inequalities (6.14) can be written in the form

$$ \Delta _k^\mathrm {T} \Delta _k \leqslant I, \quad k \in \mathbb {Z}.$$
(6.15)

An uncertainty \(\Delta _k \) satisfying (6.15) will be said to be admissible. The set of all admissible uncertainties for the system \(F \) will be denoted by \({I\!\!D} \),

$$ \begin {array}{c} {{I\!\!D}} = \left \{ \Delta _k= {\rm diag}\left \{ \begin {array}{{ccc}} \Omega _k, & \Phi _k, & \Psi _k \end {array} \right \} :\quad \Delta _k^\mathrm {T}\Delta _k \leqslant I, \quad k \in \mathbb {Z} \right \}. \end {array}$$
(6.16)
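The admissibility condition (6.15) is easy to verify numerically for a given block-diagonal uncertainty. A minimal sketch (block sizes and data are illustrative assumptions; each block is rescaled to be a contraction):

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(0)

def contraction(n):
    # Random n x n matrix rescaled so that M^T M <= I (spectral norm at most 1).
    M = rng.standard_normal((n, n))
    return M / max(1.0, np.linalg.norm(M, 2))

Omega, Phi, Psi = contraction(2), contraction(3), contraction(2)
Delta = block_diag(Omega, Phi, Psi)      # Delta_k = diag{Omega_k, Phi_k, Psi_k}

# Admissibility (6.15): Delta^T Delta <= I, i.e., largest eigenvalue at most 1.
admissible = np.linalg.eigvalsh(Delta.T @ Delta).max() <= 1.0 + 1e-12
print(admissible)
```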

Before formulating the problem, we introduce the notions of nonanticipating and admissible controller.

A linear controller \(K\) is said to be strictly nonanticipating if for each \(k \) the control \(u_k \) depends only on the current and previous observations \(y_j \), \(j\leqslant k \). A controller \(K \) is said to be admissible if it is strictly nonanticipating and internally stabilizes the closed-loop system \({\mathcal {L}}(F,K)\). The set of all admissible controllers for a given system \(F \) will be denoted by \(\mathbb {K} \).

Now let us state the problem of robust stochastic \(H_\infty \)-optimization for a system with parametric uncertainty.

Problem 6.

Given a system (6.13) and an upper bound \(a \geqslant 0 \) of the input anisotropy level, find an admissible controller \(K \in {\mathbb {K}}\) that minimizes the maximum value of the \(a \)-anisotropic norm of the closed-loop system \({\mathcal {L}}(F,K)\) over all uncertainties \(\Delta _k\in {{I\!\!D}} \), i.e., delivers a minimum to the cost functional

$$ J_0(K) = \sup _{\Delta _k\in {{I\!\!D}}} {| \! | \! |} {\mathcal {L}}(F,K) {| \! | \! |}_a .$$
(6.17)

The optimal problem (6.17) is equivalent to the optimal problem

$$ \sup _{\Delta _k\in {{I\!\!D}}} \sup _{W \in \mathcal {BW}_a} \|Z\|_\mathcal {P}^2 \rightarrow \inf _{K \in {\mathbb {K}}}. $$
(6.18)

Here \(\mathcal {BW}_a\) is the set of normalized input signals with bounded mean anisotropy,

$$ \mathcal {BW}_a=\left \{W\in \mathcal {W}_a:\|W\|_\mathcal {P}=1\right \}.$$

The stated problem is solved by embedding it in a more general \(H_{\infty } \)-optimization problem. In the new problem, the plant is the unperturbed plant in Problem 6, and the effect of parametric uncertainty is replaced by a new, additional plant input.

It has been shown that the value of the cost functional of the new problem majorizes the value of the cost functional of the original one.

The solution of the new, more general problem is reduced to solving four coupled algebraic Riccati equations, a Lyapunov equation, and an equation of a special form. To solve these equations, a modified homotopy method is presented in [162].

The main disadvantage of the proposed technique is the high computational complexity of the homotopy method. This disadvantage was overcome by using matrix inequalities. The conditions for the synthesis of static and dynamic controllers based on convex optimization methods were stated in [75] for a system with fractional-linear uncertainties of the form

$$ \begin {aligned} \left [\begin {array}{c}x_{k+1} \\ z_k \\ y_k \\ \end {array}\right ]&= \left ( \left [ \begin {array}{{ccc}} A & B_w & B_u \\ C_z & D_{zw} & D_{zu} \\ C_y & D_{yw} & 0 \\ \end {array} \right ]\right . \\ &\qquad {}+ \left . \left [\begin {array}{c} B_\Delta \\ D_{z\Delta } \\ D_{y\Delta } \\ \end {array}\right ] \Delta \left (I_{p_\Delta }-D_{{\Delta \Delta }}\Delta \right )^{-1} \left [\begin {array}{{ccc}}C_\Delta & D_{\Delta w} & D_{\Delta u}\\ \end {array}\right ]\right ) \left [\begin {array}{c} x_{k} \\ w_k \\ u_k \\ \end {array}\right ], \end {aligned}$$
(6.19)

where \(x_k \) is the \(n_x \)-dimensional state vector, \(w_k \) is the \(m_w \)-dimensional random stationary sequence with bounded mean anisotropy level \({\overline {\mathbf {A}}}(W) \leqslant a \), \(u_k \) is the \(m_u \)-dimensional control, \(z_k \) is the \(p_z \)-dimensional controllable output, and \(y_k \) is the \(p_y \)-dimensional measured output. Note that \(p_z\leqslant m_w\), the pair \(\left ( A,\thinspace B_u\right )\) is stabilizable, and the pair \(\left (A,\thinspace C_y\right )\) is detectable. The time-invariant uncertainty \(\Delta \in {{\mathbb R}}^{m_\Delta \times p_\Delta }\) satisfies the condition \(\Delta ^\mathrm {T} \Delta \leqslant \gamma ^{-2}I_{p_\Delta }\) for a given \(\gamma >0 \). It is obvious that this setting generalizes the problem for a system written in the form (6.13).

To solve the problem via an upper fractional-linear transformation, an additional \(m_\Delta \)-dimensional uncertainty input \(w_{\Delta _k} \) and a \(p_\Delta \)-dimensional uncertainty output \(z_{\Delta _k} \) were introduced such that \(w_{\Delta _k}=\Delta z_{\Delta _k}\) and \({\mathbf {E}\thinspace }|w_{\Delta _k}|^2\leqslant \gamma ^{-2}{\mathbf {E}\thinspace }|z_{\Delta _k}|^2 \).

The paper [213] solved the problem of synthesizing a robust suboptimal anisotropy-based state controller for a similar system without increasing the problem dimensionality.

Now let us state the problem following [17]. As the plant, we consider discrete-time systems given in the state space in the form

$$ x_ {k+1} = A^\Delta x_k + B_w^\Delta w_k+B_u u_k, $$
(6.20)
$$ y_k = C_y^\Delta x_k+D_{yw}^\Delta w_k, $$
(6.21)
$$ z_k = C_z^\Delta x_k+D_{zw}^\Delta w_k+D_{zu} u_k, $$
(6.22)

where \(x_k \in {{\mathbb R}}^n \) is the state vector, \(u_k \in {{\mathbb R}}^{m_1} \) is the control input, \(w_k \in {{\mathbb R}}^m \) is a random stationary sequence with a bounded mean anisotropy level \({\overline {\mathbf {A}}}(W) \leqslant a \), \(y_k \in {{\mathbb R}}^p \) is the measured output, \(z_k\in {{\mathbb R}}^{p_1} \) is the controllable output, \(A^\Delta =A+M_A\Delta N_A \), \(B_w^\Delta =B_w+M_B\Delta N_B \), \(C_z^\Delta =C_z+M_C\Delta N_C \), \(C_y^\Delta =C_y+M_{Cy}\Delta N_{Cy}\), \(D_{yw}^\Delta =D_{yw}+M_{Dy}\Delta N_{Dy}\), and \(D_{zw}^\Delta =D_{zw}+M_D\Delta N_D\). The matrices \(A \), \(B_w \), \(B_u \), \(C_y \), \(D_{yw} \), \(C_z \), \(D_{zw} \), \(D_{zu} \), \(M_A \), \(N_A \), \(M_B \), \(N_B \), \(M_C \), \(N_C \), \(M_D \), \(N_D \), \(M_{Cy} \), \(N_{Cy} \), \(M_{Dy} \), and \(N_{Dy} \) are constant and have appropriate dimensions.

The matrix \(\Delta \in {{\mathbb R}}^{q\times q} \) is unknown and is bounded in the spectral norm \(\overline {\sigma }(\Delta )\leqslant 1\); i.e., \(\Delta ^\mathrm {T} \Delta \leqslant I_q\).

It can be shown that in certain cases, the parametric uncertainties of the plant (6.19) can also be represented in the form (6.20)–(6.22). However, these sets are not identical. For instance, if \(D_{\Delta \Delta }=0\) in the expression (6.19), then the fractional-linear uncertainties are a subset of the set of uncertainties in the class (6.20)–(6.22). If, however, \( D_{\Delta \Delta }\neq 0\), then the uncertainties belong to different sets.

Conditions for synthesizing static robust suboptimal anisotropic state ( \(u_k=Fx_k \)) and output (\(u_k=K y_k \)) feedback controllers were obtained for system (6.20)–(6.22).

6.4. Anisotropy-Based Theory of Descriptor Systems

As was mentioned above, control systems closed by anisotropy-based controllers lie “in between” systems closed by \(H_2\)-controllers and systems closed by \(H_{\infty }\)-controllers. Descriptor systems are a generalization of ordinary dynamical systems, since their models contain not only difference or differential equations but also algebraic relations between state variables. The term “descriptor” came from the foreign literature: the state variables of such systems describe physical quantities, hence the name. Descriptor systems cover a wider range of applications than ordinary systems, and so generalizing theories known for ordinary systems was a natural development of control theory. There exist theories for designing \(H_{2} \)- and \(H_{\infty } \)-controllers for linear discrete descriptor systems [92, 105, 109, 110, 142, 237, 238]. Naturally, there is a desire to create an anisotropy-based theory for discrete descriptor systems.

The state-space model of a linear time-invariant descriptor system has the form

$$ E x_{k+1} = A x_k+B w_k,$$
(6.23)
$$ y_k = C x_k+D w_k.$$
(6.24)

The main distinction from ordinary systems is the matrix multiplier \(E\) on the left-hand side, which is assumed to be singular. Since \(E\) is singular, Eqs. (6.23)–(6.24) cannot be written in the form (5.13)–(5.14). As a result, descriptor systems exhibit behavior not typical of ordinary systems; namely (see [92, 221]),

  1. The transfer function of a descriptor system may not be strictly proper.

  2. For arbitrary bounded initial conditions, the time-domain response of descriptor systems may exhibit impulsive or noncausal behavior.

  3. Descriptor systems usually contain three types of modes: bounded dynamic modes, unbounded dynamic modes, and nondynamic modes; undesired impulsive behavior in descriptor systems can be generated by unbounded dynamic modes.

  4. Even if the descriptor system is impulse-free, it can still have type I discontinuities due to inconsistent initial conditions.

More details on discrete- and continuous-time descriptor systems can be found in [13, 15, 109].
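The role of the singular matrix \(E\) can be seen in a two-state toy example (the matrices are an illustrative assumption): one row of (6.23) degenerates into an algebraic constraint between the state variables, while solvability requires the standard regularity of the pencil \((E,A)\), i.e., \(\det (zE-A)\not \equiv 0 \):

```python
import numpy as np

# Toy descriptor pair (E, A) for (6.23); the data are an illustrative assumption.
E = np.array([[1.0, 0.0],
              [0.0, 0.0]])
A = np.array([[0.5, 0.2],
              [1.0, -1.0]])

# E is singular, so x_{k+1} cannot be isolated as in an ordinary system;
# the second row of E x_{k+1} = A x_k is the algebraic constraint 0 = x1_k - x2_k.
singular = abs(np.linalg.det(E)) < 1e-12

# Regularity of the pencil (E, A): det(zE - A) must not vanish identically in z.
pencil_vals = [np.linalg.det(z * E - A) for z in (0.0, 1.0, 2.0)]
regular = any(abs(v) > 1e-9 for v in pencil_vals)
print(singular, regular)   # True True
```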

Owing to the above features, constructing an anisotropy-based theory for such systems is not a trivial generalization of the anisotropy-based theory for normal systems.

The first problem solved within the framework of constructing an anisotropy-based theory for discrete-time descriptor systems was the problem of calculating the anisotropic norm of a descriptor system [12]. To calculate the anisotropic norm, one has to solve an algebraic matrix Riccati equation, a Lyapunov equation, and an equation of a special form. These three equations include the parameters of a normal system equivalent to the given descriptor system. An algorithm for calculating the anisotropic norm of a discrete-time descriptor system using convex optimization was proposed in [90]. The problems of analyzing descriptor systems affected by a random Gaussian sequence with a nonzero expectation were considered in [81].

The first paper [89] on the synthesis of optimal anisotropy-based control for descriptor systems was devoted to designing a state feedback control. The problem was reduced to solving two algebraic matrix Riccati equations, a matrix Lyapunov equation, and an equation of a special form. The problem of output feedback optimal control design was stated and solved in [14]. The solution of the problem was reduced to a two-step control synthesis procedure. At the first step, the system was causalized and reduced to the form of an equivalent normal system, and at the second step, the well-known technique for solving the synthesis problem for a normal system was applied.

Conditions for the boundedness of the anisotropic norm of a descriptor system were obtained in [1] based on solving a generalized Riccati equation and checking a number of inequalities. The problem of suboptimal state feedback and full information control (state and disturbance) was solved in [82] using generalized algebraic Riccati equations. A new method for calculating the anisotropic norm and searching for a suboptimal anisotropy-based controller based on strict matrix inequalities was proposed in [16].

More details about the stated and solved anisotropy-based control problems for descriptor systems can be found in the monograph [91].

6.5. Anisotropy-Based Analysis of Systems with Noncentered Input Signals

As was mentioned above, stationary ergodic sequences of Gaussian random vectors with zero mean were considered as external input disturbances in the initial statements of the problems of anisotropy-based analysis and control design [20, 21, 24, 70, 164, 225]. The expectations being equal to zero means that the infinite-horizon average error caused by such disturbances depends only on the covariance matrices of the vectors in the sequence. However, in practice, owing to various equipment failures or the presence of a nontrivial external disturbance, the mean values of the disturbance vectors may be nonzero. In this regard, within the framework of the anisotropy-based theory, it makes sense to consider stationary ergodic sequences of Gaussian random vectors with nonzero means as an external disturbance. The assumption that the vectors in the sequence have a constant (identical) expectation does not violate the stationarity and ergodicity of the sequence. Moreover, under certain conditions, these properties are preserved even if the expectations vary over time. Thus, considering the case of nonzero means of the vectors of the input sequence in the anisotropy-based theory actually expands the boundaries of its application. The idea of constructing an anisotropy-based theory with a nonzero expectation of the input signal belongs to Karny [165].

To construct an anisotropy-based control theory with noncentered random input signals, one has to modify the definitions of anisotropy, mean anisotropy, and anisotropic norm in the case where the input disturbance vectors have nonzero expectations. Next, one needs to consider the problems of analysis and produce algorithms for calculating anisotropy, mean anisotropy, and anisotropic norm in the frequency and time domains. The final goal is to develop anisotropy-based control design methods for systems that are affected by an uncentered random sequence as input. Consider two Gaussian \(m\)-dimensional random vectors \(w \) and \(v \) with distribution densities

$$ \begin {aligned} f(x) &= \Big ( (2\pi )^{m}\det (\Sigma ) \Big )^{-1/2} \exp \left \{ -\dfrac {1}{2}(x-\mu )^{\mathrm {T}}\Sigma ^{-1}(x-\mu ) \right \},\\ p_{\lambda }(x) &= \big (2\pi \lambda \big )^{-m/2} \exp \left \{ -\dfrac {x^{\mathrm {T}}x}{2\lambda } \right \}, \quad x\in \mathbb {R}^{m}, \end {aligned}$$

respectively. The random vector \(w\) has the nonzero expectation \( \mathbf {E}[w] = \mu \neq 0\) and the covariance matrix \(\mathbf {cov}(w) = \Sigma = \Sigma ^{T} \succ 0\), while the expectation of the vector \(v\) is zero and its covariance matrix is scalar, \(\mathbf {cov}(v) = \lambda I_{m} \). For the measure of distinction of \(w \) from \(v \), as previously, we will use the relative entropy (5.4), which, in this case, is

$$ D(f||p_{\lambda }) = - h(w) + \dfrac {m}{2}\ln (2\pi \lambda ) + \dfrac {\mathrm {tr}(\Sigma )+|\mu |^{2}}{2\lambda }, $$
(6.25)

because \(\mathbf {E}[|w|^{2}] = \mathrm {tr}(\Sigma ) + |\mu |^{2}\).

The anisotropy of a random vector is defined as a minimum, in the sense of the parameter \(\lambda >0 \), value of the relative entropy \(D(f||p_{\lambda }) \),

$$ \mathbf {A}(w) = \min \limits _{\lambda >0} D(f||p_{\lambda });$$

in view of (6.25), this leads to

$$ \mathbf {A}(w) = -\dfrac {1}{2}\ln \det \left ( \dfrac {m\Sigma }{\mathrm {tr}(\Sigma ) + |\mu |^{2}} \right ).$$

Comparing the resulting formula for the anisotropy of a random vector \(w\) with nonzero mean \(\mu \) and the expression in (5.10) determining the anisotropy of a random vector with zero expectation, we conclude that

$$ \mathbf {A}(w) |_{\mu \neq 0} = \mathbf {A}(w) |_{\mu =0} + \dfrac {m}{2}\ln \left ( \dfrac {\mathrm {tr}(\Sigma ) + |\mu |^{2}}{\mathrm {tr}(\Sigma )} \right ).$$

In other words, the presence of a nonzero expectation always increases the measure of distinction of \(w \) from the set of reference random vectors with densities \( \{p_{\lambda }(x)\!:\thinspace \lambda >0\}\). Moreover, for two distinct random vectors \(w_{1}\) and \(w_{2} \) with expectations \(\mu _{1} \) and \(\mu _{2} \) and identical covariance matrices \(\mathbf {cov}(w_{1}) =\mathbf {cov}(w_{2}) = \Sigma \), the anisotropy is larger for the vector whose expectation has the larger Euclidean norm,

$$ \mathbf {A}(w_{1}) \geqslant \mathbf {A}(w_{2}) \quad \Leftrightarrow \quad |\mu _{1}| \geqslant |\mu _{2}|. $$
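Numerically, the closed-form anisotropy above can be cross-checked against the defining minimization over \(\lambda\). A minimal sketch, assuming NumPy (the function names are ours, not from the source):

```python
import numpy as np

def anisotropy(Sigma, mu):
    # Closed form: A(w) = -(1/2) ln det( m * Sigma / (tr(Sigma) + |mu|^2) ).
    m = len(mu)
    scale = np.trace(Sigma) + mu @ mu
    return -0.5 * np.linalg.slogdet(m * Sigma / scale)[1]

def anisotropy_by_minimization(Sigma, mu, lam_grid):
    # Definition: A(w) = min_{lambda>0} D(f || p_lambda), with (cf. (6.25))
    #   D(f||p_lambda) = -h(w) + (m/2) ln(2 pi lambda)
    #                    + (tr(Sigma) + |mu|^2) / (2 lambda),
    # where h(w) = (m/2) ln(2 pi e) + (1/2) ln det(Sigma) is the
    # differential entropy of w.  Here minimized over a grid of lambda.
    m = len(mu)
    h = 0.5 * m * np.log(2 * np.pi * np.e) + 0.5 * np.linalg.slogdet(Sigma)[1]
    D = (-h + 0.5 * m * np.log(2 * np.pi * lam_grid)
         + (np.trace(Sigma) + mu @ mu) / (2 * lam_grid))
    return D.min()
```

Setting the derivative in \(\lambda\) to zero gives the minimizer \(\lambda^{*} = (\mathrm{tr}(\Sigma)+|\mu|^{2})/m\), so a grid around this value recovers the closed form; the same code with \(\mu = 0\) reproduces the centered anisotropy (5.10).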

In the case of a nonzero expectation, the mean anisotropy admits the representation

$$ \overline {\mathbf {A}}(W) = -\dfrac {1}{4\pi } \int \limits _{-\pi }^{\pi } \ln \det \left ( \dfrac {mS(\omega )}{\|G\|_{2}^{2}+|\mathcal {M}|^{2}} \right ) d\omega , $$
(6.26)

where the vector \(\mathcal {M} \) satisfies

$$ \mathcal {M} = \left (D_g+C_g(I-A_g)^{-1}B_g\right )\mu $$
(6.27)

and the matrices \(A_g, \) \(B_g, \) \(C_g, \) \(D_g \) define a shaping filter \(G \) in the state space as follows:

$$ \begin {aligned} x_{k+1} &= A_gx_{k} + B_g(v_{k}^{}+\mu ),\\ w_{k}^{} &= C_gx_{k} + D_g(v_{k}^{}+\mu ). \end {aligned} $$
(6.28)

An algorithm for calculating the mean anisotropy with nonzero mean is similar to the algorithm for calculating the mean anisotropy in the case of a centered disturbance [113, 207].

The following formula establishes a relationship between the mean anisotropy of a sequence with nonzero expectations and the mean anisotropy of a sequence with zero expectations,

$$ \overline {\mathbf {A}}(W) = \overline {\mathbf {A}}_{o}(W) + \dfrac {m}{2} \ln \left ( \dfrac {\|G\|_{2}^{2}+|\mathcal {M}|^{2}}{\|G\|_{2}^{2}} \right ),$$
(6.29)

where \(\|G\|_{2}^{2} = \mathrm {tr}(\Sigma )\) is the \(H_{2} \)-norm of the transfer function \(G(z) \).
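The increment term in (6.29) is directly computable from the filter realization (6.28). A sketch assuming NumPy (`h2_norm_sq` and `mean_anisotropy_increment` are our helper names), obtaining \(\|G\|_{2}^{2}\) from the discrete Lyapunov equation and \(\mathcal{M}\) from (6.27):

```python
import numpy as np

def h2_norm_sq(A, B, C, D):
    # ||G||_2^2 = tr(C P C^T + D D^T), where P = A P A^T + B B^T is the
    # controllability Gramian (A is assumed Schur-stable); the Lyapunov
    # equation is solved via the Kronecker-product identity.
    n = A.shape[0]
    vecP = np.linalg.solve(np.eye(n * n) - np.kron(A, A),
                           (B @ B.T).reshape(-1))
    return np.trace(C @ vecP.reshape(n, n) @ C.T + D @ D.T)

def mean_anisotropy_increment(Ag, Bg, Cg, Dg, mu):
    # Second term of (6.29): (m/2) ln( (||G||_2^2 + |M|^2) / ||G||_2^2 ),
    # with the steady-state mean M = (Dg + Cg (I - Ag)^{-1} Bg) mu of (6.27).
    m = len(mu)
    n = Ag.shape[0]
    Mvec = (Dg + Cg @ np.linalg.solve(np.eye(n) - Ag, Bg)) @ mu
    g2 = h2_norm_sq(Ag, Bg, Cg, Dg)
    return 0.5 * m * np.log((g2 + Mvec @ Mvec) / g2)
```

The increment vanishes for \(\mu = 0\) and is nonnegative otherwise, so the noncentered mean anisotropy never falls below the centered one, in agreement with (6.29).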

Consider a linear discrete time-invariant system \(F\in H_{\infty }^{p \times m} \) with \(m \)-dimensional input \(W \), \(p \)-dimensional output \(Z \), and constant matrices \(A \in \mathbb {R}^{n \times n}\), \(B \in \mathbb {R}^{n \times m}\), \(C \in \mathbb {R}^{p \times n}\), and \(D \in \mathbb {R}^{p \times m}\),

$$ \begin {aligned} x_{k+1} &= Ax_{k} + Bw_{k},\\ z_{k} &= Cx_{k} + Dw_{k}. \end {aligned}$$
(6.30)

The transfer function of such a system is \(F(z) = D +C(z^{-1}I_{n}-A)^{-1}B\). For the input \(W \) we take a sequence with mean anisotropy bounded above by a number \(a\geqslant 0\), i.e., the sequence generated by the shaping filter (6.28) from the set

$$ \mathbb {G}_{a} = \Big \{ G\in H_{2}^{m \times m} :\; \overline {\mathbf {A}}(G) \leqslant a \Big \}.$$

The anisotropic norm of system (6.30) is

$$ {{|\!|\!|}{F}{|\!|\!|}}_{a} = \sup \limits _{W:\; \overline {\mathbf {A}}(W) \leqslant a} Q(F,W) \;=\; \sup \limits _{ G \in \mathbb {G}_{a} } \sqrt { \dfrac {\|FG\|_{2}^{2}+|\mathcal {F}|^{2}} {\|G\|_{2}^{2}+|\mathcal {M}|^{2}} },$$
(6.31)

where \(\mathcal {M} \) and \(\mathcal {F} \) are, respectively, the expectations of the vectors of the input and output sequences \(W\) and \(Z \) in the steady-state mode,

$$ \begin {aligned} \mathcal {M} &= \lim \limits _{k\to \infty } \mathbf {E}[w_{k}] = \left (D_g+C_g(I-A_g)^{-1}B_g\right )\mu ,\\ \mathcal {F} &= \lim \limits _{k\to \infty } \mathbf {E}[z_{k}] = \left (D+C(I-A)^{-1}B\right )\mathcal {M}. \end {aligned} $$
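Any fixed admissible filter \(G \in \mathbb{G}_{a}\) yields a lower bound on the anisotropic norm (6.31), since the supremum dominates the ratio \(Q(F,W)\) computed for that particular \(G\). A sketch, assuming NumPy (the helper and function names are ours), that evaluates \(Q\) for given state-space realizations of \(F\) and \(G\):

```python
import numpy as np

def h2_norm_sq(A, B, C, D):
    # ||.||_2^2 via the discrete Lyapunov equation P = A P A^T + B B^T.
    n = A.shape[0]
    P = np.linalg.solve(np.eye(n * n) - np.kron(A, A),
                        (B @ B.T).reshape(-1)).reshape(n, n)
    return np.trace(C @ P @ C.T + D @ D.T)

def ratio_Q(F_ss, G_ss, mu):
    # Q(F, W) from (6.31) for a specific filter G:
    #   Q = sqrt( (||FG||_2^2 + |calF|^2) / (||G||_2^2 + |calM|^2) ).
    A, B, C, D = F_ss
    Ag, Bg, Cg, Dg = G_ss
    n, ng = A.shape[0], Ag.shape[0]
    # Series connection FG: v -> w (filter G) -> z (system F).
    Ac = np.block([[Ag, np.zeros((ng, n))], [B @ Cg, A]])
    Bc = np.vstack([Bg, B @ Dg])
    Cc = np.hstack([D @ Cg, C])
    Dc = D @ Dg
    # Steady-state means calM and calF, Eq. (6.27) and the formulas above.
    calM = (Dg + Cg @ np.linalg.solve(np.eye(ng) - Ag, Bg)) @ mu
    calF = (D + C @ np.linalg.solve(np.eye(n) - A, B)) @ calM
    num = h2_norm_sq(Ac, Bc, Cc, Dc) + calF @ calF
    den = h2_norm_sq(Ag, Bg, Cg, Dg) + calM @ calM
    return np.sqrt(num / den)
```

For instance, if \(F\) is a scalar unit delay and \(G\) is a pass-through filter, both numerator and denominator coincide and \(Q = 1\), a lower bound consistent with \(\|F\|_{2} = 1\).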

Ideas for anisotropy-based control with noncentered input disturbances were published in [168].

6.6. Anisotropy-Based Theory for Time-Varying Systems

An approach to the construction of the theory of robust anisotropy-based control and filtering for time-varying systems has been developed in recent years. The problem of anisotropy-based analysis of the robust performance of linear discrete time-varying systems on a finite time interval was considered for the first time in [24]. This paper gave definitions of anisotropy and anisotropic norm that corresponded to the new problems of time-varying anisotropy-based finite-horizon control theory.

The next important contribution to the construction of the anisotropy-based theory for finite-horizon systems in the suboptimal setting was the paper [173]. The definitions of the anisotropy of a random vector and the anisotropic norm of a linear discrete time-varying system introduced in [24] are the basis for solving problems of anisotropy-based control and filtering on a finite time interval. Necessary and sufficient conditions for the boundedness of the anisotropic norm of a system with time-varying parameters were obtained in [173]. These conditions involve solving a difference Riccati equation and an inequality for the determinant of a positive definite matrix. On the basis of this criterion, the paper [74] stated and solved the anisotropy-based control design problem for a linear discrete time-varying system that guarantees the boundedness of the anisotropic norm of the closed-loop system by a given threshold value on a finite horizon. Sufficient conditions were obtained for the existence of an anisotropy-based controller with variable parameters that guarantees the boundedness of the anisotropic norm of the closed-loop linear discrete time-varying system by a given threshold value on the finite horizon. These conditions define a method for calculating the matrix of controller parameters from the recursive solution of systems of matrix inequalities or a sequence of optimization problems.

The problem of analyzing a time-varying linear finite-horizon system with noncentered input disturbances was solved in [40]. In this paper, it was shown that calculating the anisotropic norm of the specified class of systems in the state space is associated with solving systems of difference matrix equations and equations of a special form. The solution of the suboptimal control problem for time-varying systems on a finite horizon was obtained in [74].

6.7. Anisotropy-Based Filtering

It is well known that the \(H_{2}\)- and \(H_{\infty } \)-filtering theories were developed [79, 147, 204] alongside the \(H_{2} \)- and \(H_{\infty } \)-control theories. These filtering theories share all the features of the counterpart control theories. In Kalman filtering, it is assumed that the model of the process dynamics and the statistical characteristics of the model and measurement noise are precisely known. The variance of the estimation (filtering) error serves as a quadratic optimality criterion. The Kalman filter, which provides the minimum variance of the estimation error, is a system state estimator. Minimizing or bounding the variance of the filtering error is equivalent to minimizing or bounding the \(H_{2}\)-norm of the filtering error operator. It is also well known that the Kalman filter (or \(H_{2} \)-filter) synthesized for a given model is not robust; i.e., it may lose stability under small changes in the mathematical model of the plant.

The \(H_{\infty }\)-criterion can be used in the case where there is no exact a priori information about the plant model and the statistical properties of the model and measurement noise. The filter synthesized based on the criterion of the boundedness of the \(H_{\infty }\)-norm guarantees that the \(H_{\infty }\)-norm of the operator connecting the input disturbance signal and the estimation error does not exceed a given positive number. The \(H_{\infty }\)-filtering algorithms belong to the class of minimax algorithms that minimize the worst-case estimation error (see, e.g., [131, 135, 183, 203, 204]).

It should be noted that the filtering algorithms are considered in these theories both on finite and infinite horizons [135].

Just as in the case of control, optimal \(H_{2} \)- and \(H_{\infty } \)-filters efficiently operate only when the input signals belong to the classes that were assumed when these theories were created. The use of the \(H_{2} \)-estimator in the case of a strongly “colored” input disturbance usually leads to unsatisfactory estimation errors, while the \(H_{\infty } \)-estimator, designed for the worst case, with an input disturbance in the form of white or weakly “colored” noise is unnecessarily conservative [214].

One direction for the synthesis of filters that are less conservative than \(H_{\infty }\)-filters and more robust than \(H_{2}\)-filters is the so-called “mixed” \(H_{2}/H_{\infty }\)-filtering (see, e.g., [119, 133, 156, 253]). In this approach, the \(H_{2}\)-criterion is minimized under a given constraint on the \(H_{\infty }\)-criterion. The \(H_{2}\)-, \(H_{\infty }\)-, and mixed \(H_{2}/H_{\infty }\)-filtering methods are based on the solutions of Riccati equations or linear matrix inequalities (see, e.g., [8, 101]). Achieving a trade-off between the \(H_{2}\)-optimal and \(H_{\infty }\)-optimal filters is considered in the generalized \(H_{\infty }\)-filtering problem [9], in which the joint influence of unknown initial conditions and unmeasurable external disturbances on the estimation error is minimized. For linear time-varying systems, algorithms for synthesizing time-varying filters were proposed; these algorithms are based on the recurrent solution of a system of difference linear matrix inequalities [124], with a system of linear matrix inequalities solved at each step.

The first work to appear on anisotropy-based optimal filtering for time-invariant systems on infinite horizon was [227].

The problem of optimal anisotropy-based filtering for linear time-varying discrete-time systems on a finite horizon was solved in [226]. The problem reduces to solving two difference Riccati equations in forward and reverse time.

The problem of anisotropy-based suboptimal filtering was solved by convex optimization methods in [61].

The results of [173] made it possible to solve the anisotropy-based filtering problem for linear time-varying systems on a finite horizon in the special case of equal dimensions of the estimated output and the external disturbance [241]. When solving the latter problem, a scalar auxiliary matrix variable was introduced in [241], which leads to a significant increase in conservatism.

The restriction of [241] on the equality of the dimensions of the estimated output and the external disturbance was lifted in [63]. The problem of robust filtering on a finite horizon for a linear discrete time-varying system with measured and estimated outputs and with an inaccurately known probability distribution of the input disturbance was solved. The magnitude of the estimation error was quantitatively characterized by the anisotropic norm. The problem of finding a suboptimal anisotropic estimator was reduced to a convex optimization problem. An algorithm for searching for a suboptimal anisotropic estimator based on the recursive solution of a system of matrix inequalities was presented.

The problem of stochastic anisotropy-based robust filtering on an infinite horizon was considered in [215] for a linear discrete time-invariant system subject to a noncentered random disturbance with an inaccurately known probability distribution. A sufficient condition for the anisotropic norm of a linear discrete time-invariant system to be strictly bounded by a given threshold value (the bounded real lemma) was proved in terms of matrix inequalities. A sufficient condition for the existence of an estimator that guarantees the boundedness of the anisotropic norm of the estimation error operator by a given threshold value was stated.

A solution of the filtering problem for a linear discrete time-varying system on a finite horizon was obtained in [64]. It was assumed that the external disturbance has an anisotropy bounded above and additionally satisfies two constraints on the moments. The solution is based on the criterion of the boundedness of the anisotropic norm of the system and is reduced to finding a solution of a convex optimization problem.

6.8. Other Theories Linked to Relative Entropy

In minimax LQG theory, the relative entropy is used to describe the uncertainty in the plant. Parametric uncertainties, including variations of system parameters, are singled out into the structure depicted in Fig. 8.

Fig. 8. To the description of uncertainties in the minimax LQG problem.

Mathematically, an uncertainty in the plant is described in the form

$$ \begin {aligned} x_{k+1} &= A x_k + B u_k + D \bar {w}_k,\\ z_k &= E_1 x_k + E_2 u_k,\\ y_k &= C x_k + \bar {v}_k. \end {aligned}$$

Here the initial condition \(x_0\) and the noises \(\bar {w}_k \) and \(\bar {v}_k \) are random processes determined by an unknown probability distribution function \(\nu (\cdot )\). This problem is considered on a finite horizon; i.e., \(k=0,\ldots ,N\). In this case, the relative entropy determines the “distance” of \(\nu (\cdot )\) from the nominal Gaussian distribution function \(\mu (\cdot )\). The matrices \(E_1 \) and \(E_2 \) are known and form the model of uncertainty. The probability distribution function \(\nu (\cdot )\) determines an admissible disturbance if one has the inequality

$$ R\big (\nu (\cdot )\|\mu (\cdot )\big ) - \mathbf {E}_{\nu } \left [\dfrac {1}{2} \sum _{k=0}^{N}\limits \|z_k\|^2 + d \right ] \leqslant 0,$$
(6.32)

where \(d>0 \) is a constant scalar quantity, \(\mathbf {E}_{\nu } \) is the conditional expectation with respect to \(\nu (\cdot ) \), and \(R(\nu (\cdot )\|\mu (\cdot )) \) is the relative entropy between the probability distribution functions \(\mu (\cdot )\) and \(\nu (\cdot ) \), determined by the expression

$$ R\big (\nu (\cdot )\|\mu (\cdot )\big )= \begin {cases} \displaystyle \int _{\Omega }\limits \nu (\eta ) \log \dfrac { \nu (\eta )}{ \mu (\eta )} d\eta & \text {if } \nu (\eta )\ll \mu (\eta ) \text { and } \log \dfrac { \nu (\eta )}{ \mu (\eta )} \in \mathbb {L}_1 \\[.6em] +\infty & \text {otherwise.} \end {cases}$$
(6.33)

The notation \(\nu (\eta )\ll \mu (\eta ) \) means that the probability distribution function \(\nu (\eta ) \) is absolutely continuous with respect to the function \(\mu (\eta )\).
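For two nondegenerate Gaussian distributions, the integral in (6.33) has a well-known closed form; absolute continuity holds automatically, so the relative entropy is always finite. A sketch assuming NumPy (the function name is ours, not from the source):

```python
import numpy as np

def gaussian_relative_entropy(mu1, Sigma1, mu0, Sigma0):
    # R(nu || mu) for nu = N(mu1, Sigma1) and mu = N(mu0, Sigma0):
    #   (1/2) [ tr(Sigma0^{-1} Sigma1)
    #           + (mu0 - mu1)^T Sigma0^{-1} (mu0 - mu1)
    #           - m + ln(det Sigma0 / det Sigma1) ].
    m = len(mu1)
    S0inv = np.linalg.inv(Sigma0)
    d = mu0 - mu1
    ld0 = np.linalg.slogdet(Sigma0)[1]
    ld1 = np.linalg.slogdet(Sigma1)[1]
    return 0.5 * (np.trace(S0inv @ Sigma1) + d @ S0inv @ d - m + ld0 - ld1)
```

The value is nonnegative and vanishes exactly when the two distributions coincide; e.g., for scalar \(N(1,1)\) against \(N(0,1)\) it equals \(1/2\).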

The minimax control problem in this setting consists in searching for the worst value of the expectation of a quadratic cost functional under constraint (6.32) and is expressed in the form

$$ V_\tau = \inf _{K\in \Lambda }\limits \sup _{\nu (\cdot )} \mathbf {E}_{\nu }[J_{\tau }], $$

where

$$ J_\tau = \dfrac {1}{2}x_{N+1}^\mathrm {T} Q_{N+1}x_{N+1} + \dfrac {1}{2}\sum _{k=0}^{N}\limits \Big [ x_k^\mathrm {T} Q_k x_k + u_k^\mathrm {T} R u_k\Big ] - \tau \left [R\big (\nu (\cdot )\|\mu (\cdot )\big ) - d - \mathbf {E}_{\nu }\dfrac {1}{2} \sum _{k=0}^{N}\limits \|z_k\|^2 \right ]. $$

In this case, the relative entropy is used to characterize the uncertainty in the plant. However, just as in the papers on the anisotropy-based control theory described above, it in fact refers to the description of the uncertainty of the input signal acting on the system, which is formed, among other things, by the uncertainty in the plant. Thus, the problem is to find an optimal control that minimizes a certain cost functional under constraints on the relative entropy of the random input signal. This problem reduces to designing a control for the input signal that is worst in the sense of relative entropy. The authors refer to this control as minimax control. In contrast to papers on the anisotropy-based control theory, the series of papers [189, 190, 202, 218, 219, 245, 246] on minimax control considers discrete descriptions of control systems. An analog of this description of uncertainty for systems defined by stochastic differential equations is the description using integral quadratic constraints. Within the framework of this approach, finite-horizon minimax control and filtering problems were solved in [217, 245].

Meaningful results in minimax control theory are obtained when the system is linear and the input signal distribution is Gaussian. In this case, the solution of the minimax linear-quadratic problem reduces to an optimal risk-sensitive control problem [229]. The papers described above are closely related to the papers [189,190,191,192,193,194,195,196,197,198,199,200,201,202], which also use relative entropy as an information characteristic.

7. CONCLUSIONS

The main purpose of the present paper was to review how the classical statements of the \(H_2 \)- and \(H_\infty \)-control problems were developed and modified in the second half of the 20th century and how minimax problems appeared as well as to establish their relationship with the stochastic robust control theory equipped with the anisotropy-based performance criterion created by I.G. Vladimirov and developed by A.P. Kurdyukov’s scientific school for over 20 years.

The authors apologize for the incomplete coverage of all approaches that lie in between the \(H_2 \)- and \(H_{\infty } \)-optimal and suboptimal control problems. Robust stochastic ideas and problem statements are presented in the survey to the extent that allowed us to emphasize the specific feature of the described approach and relate it to the anisotropy-based theory.

At the end of the survey, we would like to note a further direction of development of the anisotropy-based theory, in which the main object of study is linear discrete-time systems. In recent papers by V.A. Boichenko and A.P. Kurdyukov [18, 160], an attempt was made to expand the scope of the anisotropy-based theory to the case of external disturbances in the form of discrete- and continuous-time random processes with bounded \(l_2/L_2 \) or power norm. To this end, the concept of the \(\sigma \)-entropy of a random signal and the definition of the \(\sigma \)-entropy norm of a system were introduced. The axioms of this theory are built on the concept of the correlation convolution of a random signal, which allows working with both stationary and nonstationary random processes using a spectral density matrix that encapsulates all the differences resulting from the choice of the \(l_2/L_2 \) or power norm of a discrete- or continuous-time signal. The \(\sigma \)-entropy analysis results obtained are applicable to both continuous- and discrete-time systems. In the future, it is planned to develop this direction further, moving on to the statement and solution of synthesis problems.