Abstract
Two-time-scale (TTS) systems were proposed to accurately describe complex systems whose variables evolve on two distinct time scales. The differing response speeds of these variables, together with incomplete model information, degrade the tracking performance of TTS systems. For tracking control with an unknown model, the practicality of reinforcement learning (RL) has been criticized because the method requires an initial stabilizing policy. Based on singular perturbation theory (SPT), a composite sub-optimal tracking policy is developed that combines model information with measured data. In addition, a selection criterion for the initial stabilizing policy is presented by treating the policy as an input constraint. The proposed method, which integrates the RL technique with convex optimization, effectively improves both tracking performance and practicality. Finally, a simulation experiment on an F-8 aircraft model demonstrates the validity of the developed method.
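Although the paper's own algorithm is not reproduced on this page, the role of the initial stabilizing policy emphasized in the abstract can be illustrated with a standard policy-iteration (Hewer-type) sketch for a discrete-time LQR problem: policy evaluation solves a Lyapunov equation, which only converges when the current gain stabilizes the closed loop. The system matrices below are arbitrary toy values, not the F-8 model or the paper's TTS formulation.

```python
import numpy as np

# Toy discrete-time system (illustrative values, NOT the paper's model).
A = np.array([[0.95, 0.10],
              [0.00, 0.80]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)   # state weight
R = np.eye(1)   # input weight

def policy_evaluation(K):
    """Solve P = (A-BK)^T P (A-BK) + Q + K^T R K by fixed-point iteration.
    This iteration converges only if A - B K is Schur stable, which is why
    RL schemes of this type need a stabilizing initial policy."""
    Acl = A - B @ K
    P = np.eye(2)
    for _ in range(5000):
        P = Acl.T @ P @ Acl + Q + K.T @ R @ K
    return P

def policy_improvement(P):
    """Greedy gain K = (R + B^T P B)^{-1} B^T P A."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# A itself is Schur stable here, so K0 = 0 is an admissible initial policy.
K = np.zeros((1, 2))
for _ in range(20):
    P = policy_evaluation(K)
    K = policy_improvement(P)
```

After convergence, `K` satisfies the discrete-time algebraic Riccati equation's optimality condition; starting instead from a destabilizing gain would make the Lyapunov fixed-point iteration diverge, which is the practical difficulty the paper's input-constrained selection criterion is designed to avoid.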
Author information
Ethics declarations
The authors declare that there is no competing financial interest or personal relationship that could have appeared to influence the work reported in this paper.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This work was supported by the National Natural Science Foundation of China (Basic Science Center Program: 61988101), the Natural Science Foundation of China (62233005, 62273149), the Fundamental Research Funds for the Central Universities, and Shanghai AI Lab.
Xuejie Que received her M.S. degree in applied mathematics from Zhengzhou University, Zhengzhou, China, in 2019. She is currently pursuing a Ph.D. degree at the Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai, China. Her research interests include multi-time-scale systems, optimal control, and reinforcement learning.
Zhenlei Wang is currently a Professor with the School of Information Science and Engineering, East China University of Science and Technology, and also with the Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education. His research interests include intelligent control, modeling and analysis of the characteristics of complex systems, intelligent optimization algorithms, and fault diagnosis.
Xin Wang is currently an Associate Professor at Shanghai Jiao Tong University, China. His research interests include multi-variable intelligent decoupling control, control and optimization of complex industrial processes, and multiple model adaptive control.
About this article
Cite this article
Que, X., Wang, Z. & Wang, X. Reinforcement Learning for Input Constrained Sub-optimal Tracking Control in Discrete-time Two-time-scale Systems. Int. J. Control Autom. Syst. 21, 3068–3079 (2023). https://doi.org/10.1007/s12555-022-0355-6