Abstract
Moving horizon estimation (MHE) is a state estimation method that is particularly useful for nonlinear or constrained dynamic systems for which few general methods with established properties are available. This entry explains the concept of full information estimation and introduces moving horizon estimation as a computable approximation of full information. The basic design methods for ensuring stability of MHE are presented. The relationships of full information and MHE to other state estimation methods such as Kalman filtering and statistical sampling are discussed.
Introduction
In state estimation, we consider a dynamic system from which measurements are available. In discrete time, the system description is

\(x^{+} = f(x,w)\qquad y = h(x) + v \qquad\qquad (1)\)

The state of the system is \(x \in \mathbb{R}^{n}\), the measurement is \(y \in \mathbb{R}^{p}\), and the notation \(x^{+}\) means x at the next sample time. A control input u may be included in the model, but it is a known variable whose inclusion is irrelevant to state estimation, so we suppress it in the model under consideration here. We receive the measurement y from the sensor, but the process disturbance \(w \in \mathbb{R}^{g}\), the measurement disturbance \(v \in \mathbb{R}^{p}\), and the system initial state x(0) are unknown variables.
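As a concrete illustration, the model (1) can be simulated to generate the data a state estimator would see. The particular f, h, noise levels, and initial state below are illustrative assumptions, not part of this entry:

```python
import numpy as np

# Illustrative instance of x+ = f(x, w), y = h(x) + v; the specific
# f, h, and noise variances are assumptions chosen for demonstration.
def f(x, w):
    return 0.9 * x + 0.2 * np.sin(x) + w

def h(x):
    return x

rng = np.random.default_rng(0)
T = 25
x = np.empty(T + 1)
y = np.empty(T)
x[0] = 1.0                             # true initial state, unknown to the estimator
for t in range(T):
    w = 0.05 * rng.standard_normal()   # process disturbance
    v = 0.05 * rng.standard_normal()   # measurement disturbance
    y[t] = h(x[t]) + v                 # only y is available to the estimator
    x[t + 1] = f(x[t], w)
```

The estimator sees only the sequence y; the trajectory x and the disturbances w and v remain hidden.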
The goal of state estimation is to construct or estimate the trajectory of x from only the measurements y. Note that for control purposes, we are usually interested in the estimate of the state at the current time, T, rather than the entire trajectory over the time interval [0, T]. In the moving horizon estimation (MHE) method, we use optimization to achieve this goal. We have two sources of error: the state transition is affected by an unknown process disturbance (or noise), w, and the measurement process is affected by another disturbance, v. In the MHE approach, we formulate the optimization objective to minimize the size of these errors thus finding a trajectory of the state that comes close to satisfying the (error-free) model while still fitting the measurements.
First, we define some notation necessary to distinguish the system variables from the estimator variables. We have already introduced the system variables (x, w, y, v). In the estimator optimization problem, these have corresponding decision variables, which we denote by the Greek letters (χ, ω, η, ν). The relationships between these variables are

\(\chi ^{+} = f(\chi,\omega )\qquad \eta = h(\chi )\qquad \nu = y-\eta \qquad\qquad (2)\)

and they are depicted in Fig. 1. Notice that ν measures the gap between the model prediction η = h(χ) and the measurement y. The optimal decision variables are denoted \((\hat{x},\hat{w},\hat{y},\hat{v})\), and these optimal decisions are the estimates provided by the state estimator.
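The relationships (2) can be encoded directly: given trial values of χ(0) and the disturbance sequence ω, the model rolls forward and the fitting errors ν fall out as residuals. The linear f and h below are placeholders chosen only for illustration:

```python
import numpy as np

# Roll the estimator model forward from chi0 under disturbances omega and
# compute nu = y - h(chi); f and h here are placeholder linear maps.
def f(chi, omega):
    return 0.9 * chi + omega

def h(chi):
    return chi

def rollout(chi0, omega, y):
    chi = [chi0]
    for om in omega:
        chi.append(f(chi[-1], om))
    eta = np.array([h(c) for c in chi[:len(y)]])   # model predictions
    nu = np.asarray(y) - eta                       # fitting errors
    return np.array(chi), nu

# With zero disturbances and measurements generated by the error-free
# model, the residuals nu vanish (up to roundoff).
chi, nu = rollout(1.0, [0.0, 0.0], [1.0, 0.9, 0.81])
```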
Full Information Estimation
The full information objective function is

\(V _{T}(\chi (0),\boldsymbol{\omega }) =\ell _{x}(\chi (0) -\overline{x}_{0}) +\sum _{i=0}^{T-1}\ell _{i}(\omega (i),\nu (i)) \qquad\qquad (3)\)

subject to (2), in which T is the current time, \(\boldsymbol{\omega }\) is the estimated sequence of process disturbances, (ω(0), …, ω(T − 1)), y(i) is the measurement at time i, and \(\overline{x}_{0}\) is the prior (i.e., previously available) value of the initial state. Full information here means that we use all the data on the time interval [0, T] to estimate the state (or state trajectory) at time T. The stage cost \(\ell _{i}(\omega,\nu )\) penalizes the model disturbance and the fitting error, the two error sources that we reconcile in all state estimation problems.
The full information estimator is then defined as the solution to

\(\min _{\chi (0),\boldsymbol{\omega }}V _{T}(\chi (0),\boldsymbol{\omega }) \qquad\qquad (4)\)

The solution to this optimization exists for all \(T \in \mathbb{I}_{\geq 0}\) under mild continuity assumptions on the model and the choice of stage cost. Many choices of (positive, continuous) stage costs \(\ell _{x}(\cdot )\) and \(\ell _{i}(\cdot )\) are possible, providing a rich class of estimation problems that can be tailored to different applications. Because the system model (1) and cost function (3) are so general, it is perhaps best to start by specializing them to see the connection to some classic results.
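For a scalar linear special case, the full information problem (4) can be solved directly with a general-purpose optimizer. The model, quadratic weights, and data below are assumptions chosen only to make the sketch concrete:

```python
import numpy as np
from scipy.optimize import minimize

# Full information estimation for x+ = a*x + w, y = x + v with quadratic
# stage costs; a, Q, R, P0, and the data are illustrative assumptions.
a, Q, R, P0, xbar0 = 0.8, 0.1, 0.1, 1.0, 0.0
y = np.array([1.00, 0.85, 0.70, 0.55])

def V(z):
    # z packs the decision variables: chi(0) followed by omega(0), ...
    chi, cost = z[0], 0.5 * (z[0] - xbar0) ** 2 / P0
    for i, yi in enumerate(y):
        cost += 0.5 * (yi - chi) ** 2 / R          # nu(i) penalty
        if i + 1 < len(z):
            cost += 0.5 * z[i + 1] ** 2 / Q        # omega(i) penalty
            chi = a * chi + z[i + 1]               # chi+ = f(chi, omega)
    return cost

res = minimize(V, np.zeros(len(y)))                # chi(0) and 3 disturbances
xhat0, what = res.x[0], res.x[1:]
```

Note that the number of decision variables grows with T, a point taken up again when the moving horizon is introduced.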
Related Problem: The Kalman Filter
If we specialize to the linear dynamic model f(x, w) = Ax + Gw, h(x) = Cx, and let x(0), w, and v be independent, normally distributed random variables, the classic Kalman filter is known to be the statistically optimal estimator, i.e., the Kalman filter produces the state estimate that maximizes the conditional probability of x(T) given y(0), …, y(T). The full information estimator is equivalent to the Kalman filter given the linear model assumption and the following choice of quadratic stage costs

\(\ell _{x}(x) = \tfrac{1}{2}\vert x\vert _{P_{0}^{-1}}^{2}\qquad \ell _{i}(\omega,\nu ) = \tfrac{1}{2}\vert \omega \vert _{Q^{-1}}^{2} + \tfrac{1}{2}\vert \nu \vert _{R^{-1}}^{2}\)

in which the random variable x(0) is assumed to have mean \(\overline{x}_{0}\) and variance \(P_{0}\), and the random variables w and v are assumed zero mean with variances Q and R, respectively. The Kalman filter is also a recursive solution to the state estimation problem, so that only the current mean \(\hat{x}\) and variance P of the conditional density must be stored, instead of the entire history of measurements y(i), i = 0, …, T. This computational efficiency is critical for success in online application to processes with short time scales requiring fast processing.
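The recursive structure is easy to see in the scalar case: each step updates only the pair \((\hat{x}, P)\), regardless of how many measurements have been processed. The numbers below are illustrative:

```python
# Scalar Kalman filter: only the conditional mean xhat and variance P are
# carried between measurements; a, c, Q, R and the data are illustrative.
a, c, Q, R = 0.8, 1.0, 0.1, 0.1
xhat, P = 0.0, 1.0                 # prior mean xbar0 and variance P0
for y in [1.00, 0.85, 0.70]:
    # measurement update with gain K
    K = P * c / (c * P * c + R)
    xhat = xhat + K * (y - c * xhat)
    P = (1.0 - K * c) * P
    # time update through the model
    xhat = a * xhat
    P = a * P * a + Q
```

The storage is constant in T, which is exactly what the full information formulation loses in the general nonlinear case.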
But if we consider nonlinear models, the maximization of conditional density is usually an intractable problem, especially in online applications. So, MHE becomes a natural alternative for nonlinear models or if an application calls for hard constraints to be imposed on the estimated variables.
Moving the Horizon
An obvious problem with solving the full information optimization problem is that the number of decision variables grows linearly with time T, which quickly renders the problem intractable for continuous processes that have no final time. A natural alternative to full information is to consider instead a finite moving horizon of the most recent N measurements. Figure 2 displays this idea. The initial condition χ(0) is now replaced by the initial state in the horizon, χ(T − N), and the decision variable sequence of process disturbances is now just the last N variables \(\boldsymbol{\omega }= (\omega (T - N),\ldots,\omega (T - 1))\). Now, the big question remaining is what to do about the neglected past data. This question is strongly related to what penalty to use on the initial state in the horizon χ(T − N). If we make this initial state a free variable, that choice is equivalent to completely discounting the past data. If we wish to retain some of the influence of the past data and keep the moving horizon estimation problem close to the full information problem, then we must choose an appropriate penalty for the initial state. We discuss this problem next.
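The bookkeeping for the moving window is simple: only the N most recent measurements (and the corresponding N decision variables) are retained. A fixed-size buffer makes this explicit; the horizon length and data below are arbitrary illustrations:

```python
from collections import deque

# Keep only the N most recent (time, measurement) pairs; older data drop
# out of the window automatically as new measurements arrive.
N = 3
window = deque(maxlen=N)
for t, y in enumerate([1.00, 0.90, 0.80, 0.70, 0.60]):
    window.append((t, y))
    # an MHE problem over `window` would be solved here at each time t

# after five measurements the window holds times 2, 3, and 4 only
```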
Arrival Cost. When time is less than or equal to the horizon length, T ≤ N, we can simply do full information estimation, so we assume throughout that T > N. For T > N, we express the MHE objective function as

\(\hat{V }_{T}(\chi (T - N),\boldsymbol{\omega }) = \Gamma _{T-N}(\chi (T - N)) +\sum _{i=T-N}^{T-1}\ell _{i}(\omega (i),\nu (i))\)

subject to (2). The MHE problem is defined to be

\(\min _{\chi (T-N),\boldsymbol{\omega }}\hat{V }_{T}(\chi (T - N),\boldsymbol{\omega }) \qquad\qquad (5)\)

in which \(\boldsymbol{\omega }= (\omega (T - N),\ldots,\omega (T - 1))\) and the hat on V distinguishes the MHE objective function from full information. The designer must now choose this prior weighting \(\Gamma _{k}(\cdot )\) for k > N.
To think about how to choose this prior weighting, it is helpful to first think about solving the full information problem by breaking it into two non-overlapping sequences of decision variables: the decision variables in the time interval corresponding to the neglected data (ω(0), ω(1), …, ω(T − N − 1)) and those in the time interval corresponding to the considered data in the horizon (ω(T − N), …, ω(T − 1)). If we optimize over the first sequence of variables and store the solution as a function of the terminal state χ(T − N), we have defined what is known as the arrival cost. This is the optimal cost to arrive at a given state value.
Definition 1 (arrival cost)
The (full information) arrival cost is defined for k ≥ 1 as

\(Z_{k}(x) =\min _{\chi (0),\boldsymbol{\omega }}V _{k}(\chi (0),\boldsymbol{\omega })\)

subject to (2) and \(\chi (k;\chi (0),\boldsymbol{\omega }) = x\).
Notice the terminal constraint that χ at time k ends at value x. Given this arrival cost function, we can then solve the full information problem by optimizing over the remaining decision variables. What we have described is simply the dynamic programming strategy for optimizing over a sum of stage costs with a dynamic model (Bertsekas 1995).
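For low state dimension, this dynamic programming recursion can be carried out numerically on a state grid. The scalar linear model, weights, and data below are assumptions, and the brute-force tabulation is feasible only for small n:

```python
import numpy as np

# Tabulate the arrival cost Z_k(x) on a grid by forward dynamic
# programming for x+ = a*x + w, y = x + v (illustrative weights/data).
a, Q, R, P0, xbar0 = 0.8, 0.1, 0.1, 1.0, 0.0
grid = np.linspace(-2.0, 2.0, 201)
Z = 0.5 * (grid - xbar0) ** 2 / P0          # Z_0(x): the prior penalty

for yk in [1.00, 0.85]:
    # disturbance needed to move from grid point x to grid point x':
    w = grid[None, :] - a * grid[:, None]
    stage = 0.5 * w ** 2 / Q + 0.5 * (yk - grid[:, None]) ** 2 / R
    Z = np.min(Z[:, None] + stage, axis=0)  # Z_{k+1}(x') by DP over x
```

The exponential growth of such grids with state dimension is one way to see why the arrival cost cannot be computed and stored exactly in general nonlinear problems.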
We have the following important equivalence.
Lemma 1 (MHE and full information estimation)
The MHE problem (5) is equivalent to the full information problem (4) for the choice \(\Gamma _{k}(\cdot ) = Z_{k}(\cdot )\) for all k > N and N ≥ 1.
Using dynamic programming to decompose the full information problem into an MHE problem with an arrival cost penalty is conceptually important for understanding the structure of the problem, but it does not yet provide an implementable estimation strategy because we cannot compute and store the arrival cost when the model is nonlinear or other constraints are present in the problem. But if we are less concerned with the optimality of the estimator and mainly interested in other properties, such as stability of the estimator, we can find simpler design methods for choosing the weighting \(\Gamma _{k}(\cdot )\). We address this issue next.
Estimator Properties: Stability
An estimator is termed stable if small disturbances (w, v) lead to small estimate errors \(x -\hat{x}\) as time increases. Precise definitions of this idea are available elsewhere (Rawlings and Ji 2012), but the basic notion is sufficient for the purposes of this overview. In applications, properties such as stability and insensitivity to model errors are usually more important than optimality. It is possible for a filter to be optimal and yet not stable. In the linear system context, this cannot happen for “nice” systems; such nice systems are classified as detectable. Again, the precise definition of detectability for the linear case is available in standard references (Kwakernaak and Sivan 1972). Defining detectability for nonlinear systems is a more delicate affair, but useful definitions are becoming available for the nonlinear case as well (Sontag and Wang 1997).
If we lower our sights, no longer requiring that MHE be equivalent to full information estimation but only that it be a stable estimator, then the key result is that the prior penalty \(\Gamma _{k}(\cdot )\) need only be chosen smaller than the arrival cost, as shown in Fig. 3. See Rawlings and Mayne (2009, Theorem 4.20) for a precise statement of this result. This condition includes the flat prior penalty, which does not penalize the initial state in the horizon at all, so neglecting the past data completely leads to a stable estimator for detectable systems. If we want to improve on this performance, we can increase the prior penalty, and we remain guaranteed stable as long as we stay below the upper limit set by the arrival cost.
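A flat prior weighting (Γ ≡ 0) is the simplest of these stable choices: the initial state in the horizon is left free and the past data are discounted entirely. A sketch for a scalar linear model, with assumed parameters and data:

```python
import numpy as np
from scipy.optimize import minimize

# MHE over the last N measurements with a flat prior penalty (Gamma = 0);
# the model x+ = a*x + w, y = x + v and all weights are illustrative.
a, Q, R, N = 0.8, 0.1, 0.1, 3

def Vhat(z, ywin):
    chi, cost = z[0], 0.0              # z[0] = chi(T-N), free (no penalty)
    for i, yi in enumerate(ywin):
        cost += 0.5 * (yi - chi) ** 2 / R
        if i + 1 < len(z):
            cost += 0.5 * z[i + 1] ** 2 / Q
            chi = a * chi + z[i + 1]
    return cost

ywin = [0.90, 0.75, 0.60]              # the N most recent measurements
res = minimize(Vhat, np.zeros(N), args=(ywin,))
xhat_init = res.x[0]                   # estimate of the state at time T - N
```

Adding a nonzero prior penalty on z[0] that underbounds the arrival cost would recover some influence of the discarded data while preserving stability.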
Related Problem: Statistical Sampling
MHE is based on optimizing an objective function that bears some relationship to the conditional probability of the state (trajectory) given the measurements. As discussed in the section on the Kalman filter, if the system is linear with normally distributed noise, this relationship can be made exact, and MHE is therefore an optimal statistical estimator. But in the nonlinear case, the objective function is chosen with engineering judgment and is only a surrogate for the conditional probability. By contrast, sampling methods such as particle filtering are designed to sample the conditional density even in the nonlinear case. The mean and variance of the samples then provide estimates of the mean and variance of the conditional density of interest. In the limit of infinitely many samples, these methods are exact. The efficiency of the sampling methods depends strongly on the model and the dimension of the state vector n, however, and this efficiency is particularly important for online use of state estimators. Rawlings and Bakshi (2006) and Rawlings and Mayne (2009, pp. 329–355) provide some comparisons of particle filtering with MHE and also describe some hybrid methods combining MHE and particle filtering.
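For contrast with the optimization-based approach, a bootstrap particle filter approximates the conditional density by propagating and reweighting random samples. The scalar model, noise levels, and sample count below are illustrative assumptions:

```python
import numpy as np

# Bootstrap particle filter for x+ = f(x, w), y = h(x) + v; the model,
# variances, and number of samples M are illustrative assumptions.
rng = np.random.default_rng(1)
M, Q, R = 500, 0.1, 0.1
particles = rng.standard_normal(M)                 # samples of x(0)

def f(x, w): return 0.8 * x + w
def h(x): return x

for y in [1.00, 0.85, 0.70]:
    # propagate each sample through the model with sampled disturbances
    particles = f(particles, np.sqrt(Q) * rng.standard_normal(M))
    # weight by measurement likelihood and resample
    wts = np.exp(-0.5 * (y - h(particles)) ** 2 / R)
    wts /= wts.sum()
    particles = particles[rng.choice(M, size=M, p=wts)]

xhat = particles.mean()                            # conditional-mean estimate
```

The number of samples needed for a given accuracy grows quickly with the state dimension, which is the efficiency concern noted above.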
Summary and Future Directions
MHE is one of the few state estimation methods that can be applied to nonlinear models for which properties such as estimator stability can be established (Rao et al. 2003; Rawlings and Mayne 2009). The required online solution of an optimization problem is computationally demanding in some applications but can provide significant benefits in estimator accuracy and rate of convergence (Patwardhan et al. 2012). Current topics for MHE theoretical research include treating bounded rather than convergent disturbances and establishing properties of suboptimal MHE (Rawlings and Ji 2012). The current main focus for MHE applied research involves reducing the online computational complexity to reliably handle challenging large-dimensional, nonlinear applications (Kuhl et al. 2011; Lopez-Negrete and Biegler 2012; Zavala and Biegler 2009; Zavala et al. 2008).
Recommended Reading
Moving horizon estimation has by this point a fairly extensive literature; a recent overview is provided in Rawlings and Mayne (2009, pp. 356–357). The following references provide (i) general background required to understand MHE theory and its relationship to other methods, (ii) computational methods for solving the real-time MHE optimization problem, or (iii) challenging nonlinear applications that demonstrate benefits and probe the current limits of MHE implementations.
Bibliography
Bertsekas DP (1995) Dynamic programming and optimal control, vol 1. Athena Scientific, Belmont
Kuhl P, Diehl M, Kraus T, Schloder JP, Bock HG (2011) A real-time algorithm for moving horizon state and parameter estimation. Comput Chem Eng 35:71–83
Kwakernaak H, Sivan R (1972) Linear optimal control systems. Wiley, New York. ISBN:0-471-51110-2
Lopez-Negrete R, Biegler LT (2012) A moving horizon estimator for processes with multi-rate measurements: a nonlinear programming sensitivity approach. J Process Control 22:677–688
Patwardhan SC, Narasimhan S, Jagadeesan P, Gopaluni B, Shah SL (2012) Nonlinear Bayesian state estimation: a review of recent developments. Control Eng Pract 20:933–953
Rao CV, Rawlings JB, Mayne DQ (2003) Constrained state estimation for nonlinear discrete-time systems: stability and moving horizon approximations. IEEE Trans Autom Control 48(2): 246–258
Rawlings JB, Bakshi BR (2006) Particle filtering and moving horizon estimation. Comput Chem Eng 30:1529–1541
Rawlings JB, Ji L (2012) Optimization-based state estimation: current status and some new results. J Process Control 22:1439–1444
Rawlings JB, Mayne DQ (2009) Model predictive control: theory and design. Nob Hill Publishing, Madison, 576 p. ISBN:978-0-9759377-0-9
Sontag ED, Wang Y (1997) Output-to-state stability and detectability of nonlinear systems. Syst Control Lett 29:279–290
Zavala VM, Biegler LT (2009) Optimization-based strategies for the operation of low-density polyethylene tubular reactors: nonlinear model predictive control. Comput Chem Eng 33(10):1735–1746
Zavala VM, Laird CD, Biegler LT (2008) A fast moving horizon estimation algorithm based on nonlinear programming sensitivity. J Process Control 18: 876–884
© 2015 Springer-Verlag London
Rawlings, J.B. (2015). Moving Horizon Estimation. In: Baillieul, J., Samad, T. (eds) Encyclopedia of Systems and Control. Springer, London. https://doi.org/10.1007/978-1-4471-5058-9_4