Abstract
Model Predictive Path Integral (MPPI) control algorithms have been studied for autonomous control systems because they are convenient to implement: they combine model predictive trajectory samples with a stochastic control approach and can handle a wide range of complex costs and constraints. This paper presents a path-following control algorithm for autonomous vehicles based on the MPPI control framework. Using importance sampling within model predictive control, the iterative path integral provides acceleration commands that let the vehicle track a virtual target on a desired path and achieve the optimal trajectory under the constraints. The optimal acceleration commands are updated by a stochastic control approach using model predictive trajectory samples. This approach solves the nonlinear control problem with complex costs and constraints efficiently, without intractable convexification or linearization. We implemented the algorithm on a Graphics Processing Unit (GPU) to show that it can be computed quickly. We tested the algorithm on various paths and under wind disturbance, using a nonlinear disturbance observer that made the model prediction more accurate in an uncertain environment. The simulation results show that the algorithm is effective and applicable to path-following guidance for various paths under disturbances.
1 Introduction
Guidance and control problems of Unmanned Aerial Vehicles (UAVs) have been widely studied in recent years as the use of UAVs in various fields has increased. One of these problems is the trajectory control problem, which concerns the controller that directs a vehicle along a predetermined path. This controller is typically designed by one of two approaches: the trajectory-tracking approach, which includes time constraints, and the path-following approach, which does not [1].
There have been many studies on path-following using various approaches and techniques. Some studies have used Lyapunov stability conditions to guarantee convergence of the controller [2]. Other nonlinear control techniques have also been applied to this problem, including backstepping [3], feedback linearization [4], and sliding mode control [5].
Model Predictive Control (MPC) handles the path-following problem as an optimization problem. In the MPC framework, a finite-horizon optimal control problem is solved repeatedly at each sampling instant [6]. This optimization approach deals with constraints on states and control inputs more conveniently than other nonlinear control techniques. The authors of [7] used the MPC approach to address a path-following problem in the presence of wind disturbances.
For many MPC approaches, optimization problems with complex costs and constraints can be intractable, and an optimization solver is required to handle them. Model Predictive Path Integral (MPPI) control applies path-integral control theory to solve the optimal control problem [8]. It obtains the optimal control sequence from model predictive trajectory samples without an optimization solver, and it is comparatively tractable while handling complex costs, constraints, and dynamics. The authors of [9] used MPPI to address an optimal control problem in an aggressive driving task.
This paper presents a path-following control algorithm for autonomous vehicles based on model predictive path integral control. The iterative path integral uses importance sampling within model predictive control to provide acceleration commands that let the vehicle track a virtual target on the desired path and achieve the optimal trajectory under practical constraints. This approach solves the nonlinear control problem under complex costs and constraints efficiently, without intractable convexification or linearization. We implemented the algorithm on a Graphics Processing Unit (GPU) to show that it can be computed rapidly. We tested the proposed algorithm on various paths and under wind disturbance using a nonlinear disturbance observer, which made the model prediction more accurate in an uncertain environment. The simulation results showed that the algorithm was effective and applicable to path-following guidance for various paths under disturbances.
The study of MPPI control for the path-following problem of UAVs is still in its infancy compared to other approaches, such as nonlinear control techniques and convex optimization-based methods. This paper presents one way to address the problem using the MPPI control framework. To be clear, this study is not intended to replace the existing approaches mentioned above, but to add an attractive alternative to the available methods. In particular, aerial vehicles are vulnerable to disturbances such as wind, which can affect their stability and the performance of guidance and control methods. Therefore, in this study, disturbance estimation and model prediction based on that estimate were added to the existing MPPI control framework.
2 Problem Formulation
To formulate the path-following problem with a virtual target, it is necessary to define a UAV dynamics model and a look-ahead virtual target on the desired path.
2.1 UAV Dynamics Model
In this study, we consider a two-dimensional planar kinematic model to describe UAV motion as follows,
where the subscripts R and B denote the inertial reference frame and the body frame, respectively, \(\left( {x_{R} , y_{R} } \right)\) is the inertial position, V is the airspeed, \(\uppsi \) is the heading angle, \(\left( {W_{{x_{R} }} ,W_{{y_{R} }} } \right)\) is the wind disturbance, and \(a_{{y_{B} }}\) is the lateral acceleration.
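As an illustration of how such a model can be stepped forward in a simulation, the sketch below assumes the common planar form \(\dot{x}_{R} = V\cos \uppsi + W_{x_{R}}\), \(\dot{y}_{R} = V\sin \uppsi + W_{y_{R}}\), \(\dot{\uppsi } = a_{y_{B}} /V\), which is consistent with the symbols defined above but is an assumption rather than the paper's stated equation. The function name `uav_step`, the explicit Euler integration, and the numerical values (20 m/s airspeed, 4 m/s headwind) are hypothetical choices for the example.

```python
import math

def uav_step(state, a_yb, V, wind, dt):
    """Advance the assumed planar kinematic model by one explicit-Euler step.

    state = (x_R, y_R, psi); wind = (W_xR, W_yR); a_yb is the lateral acceleration.
    """
    x, y, psi = state
    wx, wy = wind
    x_next = x + (V * math.cos(psi) + wx) * dt   # inertial x velocity plus wind
    y_next = y + (V * math.sin(psi) + wy) * dt   # inertial y velocity plus wind
    psi_next = psi + (a_yb / V) * dt             # turn rate from lateral acceleration
    return (x_next, y_next, psi_next)

# Hypothetical example: straight flight along +x for 1 s against a headwind.
state = (0.0, 0.0, 0.0)
for _ in range(100):
    state = uav_step(state, a_yb=0.0, V=20.0, wind=(-4.0, 0.0), dt=0.01)
```

With zero lateral acceleration the heading stays constant, and the ground speed along x is the airspeed reduced by the headwind, so the UAV covers about 16 m in this example.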
2.2 Virtual Target
In this problem, a virtual target is assumed to move along the desired path ahead of the UAV at a look-ahead distance. The desired path is defined as the line segments connecting the waypoints. The virtual target is located at the first waypoint at the beginning of the guidance phase. The first desired path is then the line between waypoints 1 and 2, as shown in Fig. 1, where \(R_{la}\) is the look-ahead distance and \(\uplambda \) is the line-of-sight (LOS) angle. When the virtual target arrives at waypoint 2, the desired path changes to the line between waypoints 2 and 3. This process repeats until the virtual target arrives at the final waypoint.
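One geometric reading of this construction can be sketched as follows: the virtual target is placed where a circle of radius \(R_{la}\) around the UAV crosses the current path segment, taking the crossing farther along the segment. The function name `virtual_target`, the clamping to the segment end points, and the fallback to the closest point when the UAV is farther than \(R_{la}\) from the line are assumptions for the example, not details taken from the paper.

```python
import math

def virtual_target(p, a, b, r_la):
    """Return (target, reached_end) for UAV position p and path segment a -> b.

    The target is the forward intersection of the look-ahead circle (radius r_la,
    centred at p) with the line through a and b, clamped to the segment.
    """
    ax, ay = a
    bx, by = b
    seg_len = math.hypot(bx - ax, by - ay)
    ux, uy = (bx - ax) / seg_len, (by - ay) / seg_len   # unit vector along the segment
    # Along-track (s) and cross-track (e) coordinates of the UAV relative to a.
    s = (p[0] - ax) * ux + (p[1] - ay) * uy
    e = -(p[0] - ax) * uy + (p[1] - ay) * ux
    # If the UAV is farther than r_la from the line, fall back to the closest point.
    reach = math.sqrt(max(r_la ** 2 - e ** 2, 0.0))
    s_t = min(max(s + reach, 0.0), seg_len)
    target = (ax + s_t * ux, ay + s_t * uy)
    return target, s_t >= seg_len

# Hypothetical example: UAV 3 m off a straight segment, look-ahead distance 5 m.
p = (0.0, 3.0)
target, reached = virtual_target(p, (0.0, 0.0), (100.0, 0.0), r_la=5.0)
los = math.atan2(target[1] - p[1], target[0] - p[0])   # LOS angle to the target
```

When `reached` becomes true, the caller would switch to the next pair of waypoints, which mirrors the segment change described above.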
3 MPPI Control
The MPPI control algorithm has a favorable feature: it is convenient to implement using model predictive trajectory samples with a stochastic control approach. Furthermore, it can handle a wide range of complex costs and constraints. In other words, it requires only a large number of sample trajectories from Monte-Carlo simulation, without other intractable tasks such as obtaining derivatives, linearization, or convexification.
This approach can be explained as an optimal control problem for a control- and noise-affine stochastic system. The optimal control problem can be expressed using the stochastic Hamilton-Jacobi-Bellman (HJB) equation, a type of Partial Differential Equation (PDE). By introducing the exponential transform with an assumption [8], the PDE can be linearized. The transformed PDE is the linear Chapman-Kolmogorov equation, and the Feynman-Kac lemma transforms it into a path integral that takes the form of an expectation over trajectories. The path integral form of the optimal control therefore contains expectation terms, which can be approximated by the empirical expectation over thousands of sample trajectories from Monte-Carlo simulation. This is only a brief explanation of the MPPI control method; a more detailed explanation can be found in [8].
Let us consider a stochastic dynamic system.
where \(x_{t} \in {\mathbb{R}}^{n}\) denotes the state vector, \(u_{t} \in {\mathbb{R}}^{m}\) denotes the control input at time t, and \(\delta u_{t} \sim N\left( {0, \,\Sigma } \right)\) is a Gaussian noise vector with zero mean and covariance \(\Sigma \). We consider the stochastic optimal control problem that minimizes the following objective.
where the subscript T denotes the final time, \(\upphi \) is the terminal cost, q is the state-dependent running cost, and \({\boldsymbol{R}} \in {\mathbb{R}}^{m \times m}\) is a positive definite matrix. Based on the MPPI algorithm [8, 9], the path integral form of the iterative optimal control is
where K is the number of random samples, \(\tilde{\boldsymbol{S}}\left( {\boldsymbol{\tau}} \right)\) is the modified cost-to-go, and \(\tilde{q}\left( {{\boldsymbol{x}}_{{\boldsymbol{t}}} ,{\boldsymbol{u}}_{{\boldsymbol{t}}} ,\boldsymbol{\delta u}_{{\boldsymbol{t}}} } \right)\) is the modified running cost. The hyper-parameter \(\uplambda \in {\mathbb{R}}^{ + }\) is the temperature, and \(\upupsilon \in {\mathbb{R}}^{ + }\) is the exploration variance. These modified terms are derived using the simplifications in [9]. The MPPI control algorithm is given in Algorithm 1.
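The sampling-based update described here can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the paper's Algorithm 1: it assumes a scalar control, discrete dynamics of the form \(x_{t+1} = F(x_{t}, u_{t} + \delta u_{t})\), and an update that adds the weighted average of the sampled perturbations with weights proportional to \(\exp(-\tilde{S}/\uplambda )\). The names `mppi_update`, `dynamics`, `q_tilde`, and `phi` are hypothetical, the modified running cost is supplied by the caller, and the toy cost below omits the control-cost terms.

```python
import numpy as np

def mppi_update(x0, u_seq, dynamics, q_tilde, phi, lam, sigma, K, rng):
    """One MPPI iteration for a scalar control sequence u_seq of length N."""
    N = len(u_seq)
    du = rng.normal(0.0, sigma, size=(K, N))            # sampled control perturbations
    x = np.tile(np.asarray(x0, dtype=float), (K, 1))    # K copies of the initial state
    S = np.zeros(K)                                     # cost-to-go of each sample
    for t in range(N):
        S += q_tilde(x, u_seq[t], du[:, t])             # accumulate running cost
        x = dynamics(x, u_seq[t] + du[:, t])            # roll all samples forward
    S += phi(x)                                         # terminal cost
    w = np.exp(-(S - S.min()) / lam)                    # shift by min for numerical stability
    w /= w.sum()
    return u_seq + w @ du                               # importance-weighted perturbation average

# Hypothetical toy problem: drive a scalar state from 1 toward 0.
dynamics = lambda x, v: x + 0.1 * v[:, None]
q_tilde = lambda x, u, du: x[:, 0] ** 2
phi = lambda x: x[:, 0] ** 2
rng = np.random.default_rng(0)
u_new = mppi_update([1.0], np.zeros(10), dynamics, q_tilde, phi,
                    lam=10.0, sigma=1.0, K=2000, rng=rng)
```

All K rollouts are advanced together in each time step, which is the same per-sample independence that makes the method suitable for GPU parallelization.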
4 Simulation Results
In this section, we present the results of three case studies. The first case study demonstrates the optimality of the MPPI controller. We set up an MPPI controller whose cost function was the same as the objective of the pursuit guidance with optimal error dynamics, and compared it with the pursuit guidance. The second case study shows the convenience of the MPPI controller when dealing with a complex cost function. To this end, we designed the cost function as desired. The final case study investigates the performance of the MPPI controller under wind disturbances, which make the model prediction inaccurate. We then tested the proposed method in the same environment using a nonlinear disturbance observer, which helped to predict the model uncertainty more precisely.
In the simulation studies, we implemented the algorithm using Python 3.0 with the pycuda library. The computer specifications were an Intel(R) Core(TM) i7-10700 CPU, 32 GB RAM, and an NVIDIA GeForce RTX 3090 GPU. The simulation was a 3DOF UAV simulation with the MPPI controller, K = 4096 and N = 100, run for 10 s of simulation time with a time step of 0.01 s. Under this condition, the recorded computation time was 25 s. When we changed the MPPI parameters to K = 1024 and N = 20 and the acceleration update time step to 0.05 s, while the simulation itself still ran with the 0.01 s time step, the recorded computation time was 5 s and the simulation result was almost the same.
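To put the two settings side by side, the short calculation below counts the total number of simulated model steps, taken as (number of control updates) x K x N. It assumes that the first setting updates the control at every 0.01 s simulation step, which the text implies but does not state explicitly. The function name `rollout_steps` is hypothetical.

```python
def rollout_steps(sim_time, update_dt, K, N):
    """Total simulated model steps: control updates x K samples x N horizon steps."""
    n_updates = round(sim_time / update_dt)
    return n_updates * K * N

full = rollout_steps(10.0, 0.01, K=4096, N=100)     # first setting
reduced = rollout_steps(10.0, 0.05, K=1024, N=20)   # second setting
ratio = full // reduced
print(full, reduced, ratio)
```

Under this assumption the reduced setting performs 100 times fewer rollout steps, while the recorded computation time drops by a factor of 5 (25 s to 5 s). One possible reading is that work other than the rollouts accounts for a noticeable share of the run time.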
4.1 Comparison of an MPPI Controller and Pursuit Guidance
In this subsection, the proposed MPPI controller is compared with the pursuit guidance to show the optimality of the MPPI controller. The pursuit guidance with optimal error dynamics is written as:
where k is the guidance gain, V is the airspeed of the UAV, \(\sigma\) is the line-of-sight angle, \(\psi\) is the heading angle, and \(t_{go}\) is the time-to-go, which is calculated as \(t_{go} = \frac{{R_{la} }}{V}\).
The desired heading angle error dynamics of the pursuit guidance law is given by:
The objective that minimizes the control effort under the optimal error dynamics is described in [10] as:
where \(u = a_{{y_{B} }}\) and \(t_{go} = t_{f} - t\). We compare the simulation results of the proposed MPPI control-based guidance and the pursuit guidance. The simulation settings and the sums of costs are provided in Table 1 and Table 2, respectively.
The cost of the MPPI controller in this section is the same as the pursuit guidance objective:
The difference between MPPI controller #1 and MPPI controller #2 is the hyper-parameter \({\uplambda }\) in Table 1. This parameter is called the temperature and plays the same role as the temperature in a Boltzmann distribution or softmax function: it controls how strongly lower-cost control inputs are weighted. If \({\uplambda }\) is small, the low-cost control inputs dominate. The appropriate scale of \({\uplambda }\) also depends on the scale of the cost, N, and K. We compared two MPPI controllers with \({\uplambda }\) of 1000 and 10,000, respectively.
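The effect of the temperature can be seen in a small numerical example. The sketch below assumes softmax-style weights proportional to \(\exp(-S/\uplambda )\) and uses three made-up trajectory costs; the function name `mppi_weights` and the cost values are hypothetical and are not taken from the tables.

```python
import numpy as np

def mppi_weights(costs, lam):
    """Normalized weights exp(-S / lam), shifted by the minimum cost for stability."""
    costs = np.asarray(costs, dtype=float)
    w = np.exp(-(costs - costs.min()) / lam)
    return w / w.sum()

costs = [100.0, 1100.0, 2100.0]              # hypothetical trajectory costs
w_small = mppi_weights(costs, lam=1000.0)    # approx. [0.665, 0.245, 0.090]
w_large = mppi_weights(costs, lam=10000.0)   # approx. [0.367, 0.332, 0.301]
```

With the smaller temperature the lowest-cost sample receives about two thirds of the total weight, whereas with the larger temperature the three samples are weighted almost equally.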
In Table 2, MPPI controller #1 has the lowest sum of costs, which shows that it outperforms the pursuit guidance in this case. This is because \(t_{go} = \frac{{R_{la} }}{V}\) in (9) differs from \(t_{go} = t_{f} - t\) in (11), so the control sequence produced by the pursuit guidance is not optimal over the whole trajectory. Thus, the control input of MPPI controller #1 is closer to the optimal control than that of the pursuit guidance. However, MPPI controller #2 was worse than the pursuit guidance. This is because \({\uplambda }\) is too large: high-cost trajectories are weighted almost as much as low-cost ones, so the optimality of the MPPI control degrades.
The advantage of the MPPI controller was larger on the trapezoidal path than on the straight-line path. This is because the target changes direction along the trapezoidal path. The MPPI controllers can predict this change and take it into account, whereas the pursuit guidance cannot. This explains the difference in cost-sum between the MPPI controllers and the pursuit guidance, especially on the trapezoidal path (Figs. 3, 4).
4.2 Complex Cost Function
In this subsection, we test how convenient the MPPI controller is to use when dealing with a complex cost function. The cost function is defined as:
In this case study, the scale of \({\uplambda }\) is smaller than in the previous case because the cost is smaller. In Table 4, the MPPI controllers have lower cost-sums than the pursuit guidance, because the MPPI controllers are designed to minimize the cost function (13) and the pursuit guidance is not. In Fig. 5, the UAV under MPPI controller #1 is closer to the path at locations #1, #2, and #4, but not at location #3. At location #3, the drastic change in path angle increases the magnitude of the heading angle error until it dominates the cost, so MPPI controller #1 rotates the UAV in advance even though this increases the distance-to-path cost. This means that the MPPI controller performs well with respect to the cost (13) (Table 3).
4.3 Simulation with Wind Disturbances
We tested the performance of an MPPI controller in the presence of slowly varying wind disturbances, with and without disturbance estimation by the Nonlinear Disturbance Observer (NDO). The experiment comprised three cases. The first case was an MPPI controller without the NDO, which did not estimate the wind disturbances. The second case was an MPPI controller with the NDO, which did estimate the wind disturbances. The last case was an MPPI controller without wind disturbances, for comparison with the other cases.
The nonlinear disturbance observer model is described in [11] as:
where l is a constant observer gain, z is an internal state variable, \(\hat{d}_{x}\) and \(\hat{d}_{y}\) are the estimated disturbances, and (x, y) are the state variables. In this test, we set \(l_{x} = l_{y} = 20\). We used the same MPPI controller for all three cases. The details are summarized in Table 5. Assuming slowly varying wind disturbances, we chose \(v_{wx} = - 4 + \sin \left( {0.2\uppi \,{\text{t}}} \right), \,v_{wy} = 0\).
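The behavior of such an observer can be illustrated for one axis. The sketch below assumes the widely used constant-gain form \(\dot{z} = -l\left( z + lx \right) - lf\), \(\hat{d} = z + lx\) for a system \(\dot{x} = f + d\), which gives the estimation dynamics \(\dot{\hat{d}} = l( d - \hat{d} )\). This form is an assumption consistent with the variables named above, not a reproduction of the paper's observer equations. The function names `ndo_step` and `wind`, the known model term `f = 20.0`, and the Euler integration are hypothetical; the gain and wind profile follow the values given in the text.

```python
import math

def wind(t):
    """Slowly varying wind along x, as specified in the text."""
    return -4.0 + math.sin(0.2 * math.pi * t)

def ndo_step(z, x, f, l, dt):
    """One Euler step of the assumed constant-gain observer for x_dot = f + d."""
    d_hat = z + l * x                       # current disturbance estimate
    z_next = z + (-l * d_hat - l * f) * dt  # z_dot = -l (z + l x) - l f
    return z_next, d_hat

z, x, l, dt = 0.0, 0.0, 20.0, 0.01
f = 20.0                                    # hypothetical known model term (e.g. V cos(psi))
for k in range(1000):                       # 10 s of simulated time
    d = wind(k * dt)
    z, d_hat = ndo_step(z, x, f, l, dt)
    x += (f + d) * dt                       # true state driven by model term plus wind
err = abs(d_hat - d)
```

With this gain the estimate settles within a fraction of a second and then follows the slowly varying wind with a small lag, which is why the estimate can be fed into the model prediction.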
The simulation results are presented in Table 6. The worst case is the one with wind disturbances and without the NDO. This is expected, because the unmodeled wind disturbances make the trajectory samples predicted by the MPPI controller inaccurate. The cost-sum of the case with wind disturbances and the NDO is higher than that of the case without wind disturbances. This is because the cost function of the MPPI controller includes the magnitude of the heading angle error. Wind disturbances create a gap between the azimuth attitude (i.e., heading direction) and the flight path angle (i.e., velocity direction). With the NDO, the flight path angle follows the path, but the heading differs from the flight path angle, so the heading error cost persists. Figure 6 shows that the case with wind disturbance and the NDO has a higher cost than the case without wind disturbance, even though both follow the path similarly well.
In Fig. 7, the trajectory of the case without the NDO is twisted: the wind disturbances pushed the UAV off its trajectory, the virtual target did not reach the third waypoint on the first attempt, and the UAV had to move back toward the waypoint. In contrast, the trajectory of the case with the NDO is similar to that of the case without wind disturbances. The MPPI controller can therefore perform well when the disturbance is estimated correctly (Figs. 7, 8).
5 Conclusion
In this paper, we proposed a path-following guidance method based on Model Predictive Path Integral control as an attractive alternative that extends the available methods. We tested the algorithm on various paths and under wind disturbance with a nonlinear disturbance observer, which made the model prediction more accurate in an uncertain environment. We compared the pursuit guidance with an MPPI controller whose cost was the same as the pursuit guidance objective in our problem. The MPPI controller performed better than the pursuit guidance owing to the predictive property of model predictive control. We also conducted a simple test with a complex cost function, which showed that the MPPI controller can handle complex costs and constraints easily, using only sample trajectories. Finally, we considered slowly varying wind disturbances and tested the MPPI controller with disturbance estimation by a nonlinear disturbance observer.
References
[1] Bartomeu R, Ramon P, Bernardo M (2020) A survey of path following control strategies for UAVs focused on quadrotors. J Intell Robot Syst 98:241–265
[2] Chen Y, Liang J, Wang C, Zhang Y (2017) Combined of Lyapunov stable and active disturbance rejection control for the path following of a small unmanned aerial vehicle. Int J Adv Robot Syst 14(2)
[3] Cabecinhas D, Cunha R, Silvestre C (2015) A globally stabilizing path following controller for rotorcraft with wind disturbance rejection. IEEE Trans Control Syst Technol 23(2):708–714
[4] Roza A, Maggiore M (2012) Path following controller for a quadrotor helicopter. In: 2012 American Control Conference, pp 4655–4660
[5] Yamasaki T, Balakrishnan SN, Takano H (2012) Integrated guidance and autopilot for a path-following UAV via high-order sliding modes. In: 2012 American Control Conference, pp 143–148
[6] Mayne D, Rawlings J, Rao C, Scokaert P (2000) Constrained model predictive control: stability and optimality. Automatica 36(6):789–814
[7] Rucco A, Aguiar AP, Pereira FL, De Sousa JB (2016) A predictive path-following approach for fixed-wing unmanned aerial vehicles in presence of wind disturbances. In: Reis L, Moreira A, Lima P, Montano L, Muñoz-Martinez V (eds) Robot 2015: Second Iberian Robotics Conference. AISC, vol 417, pp 623–634. Springer, Cham. https://doi.org/10.1007/978-3-319-27146-0_48
[8] Williams G, Aldrich A, Theodorou EA (2017) Model predictive path integral control: from theory to parallel computation. J Guid Control Dyn 40(2):344–357
[9] Williams G, Drews P, Goldfain B, Rehg JM, Theodorou EA (2016) Aggressive driving with model predictive path integral control. In: IEEE International Conference on Robotics and Automation, pp 1433–1440
[10] He S, Lee CH (2018) Optimality of error dynamics in missile guidance problems. J Guid Control Dyn 41(7):1624–1633
[11] Chen WH (2004) Disturbance observer based control for nonlinear systems. IEEE/ASME Trans Mechatron 9(4):706–710
Acknowledgements
This research was supported by the Unmanned Vehicles Core Technology Research and Development Program through the National Research Foundation of Korea (NRF) and the Unmanned Vehicle Advanced Research Center (UVARC), funded by the Ministry of Science and ICT, the Republic of Korea (No. NRF2020M3C1C1A0108316111).
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Jeong, ET., Lee, CH. (2023). Path-Following Guidance Using Model Predictive Path Integral Control. In: Lee, S., Han, C., Choi, JY., Kim, S., Kim, J.H. (eds) The Proceedings of the 2021 Asia-Pacific International Symposium on Aerospace Technology (APISAT 2021), Volume 2. APISAT 2021. Lecture Notes in Electrical Engineering, vol 913. Springer, Singapore. https://doi.org/10.1007/978-981-19-2635-8_23
DOI: https://doi.org/10.1007/978-981-19-2635-8_23
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-2634-1
Online ISBN: 978-981-19-2635-8