Optimal Control of Unknown Discrete-Time Linear Systems with Additive Noise

Yang, Xue; Liu, Shujun

doi:10.1007/s11424-023-1352-4

Published: 19 April 2023

Volume 36, pages 591–612 (2023)

Journal of Systems Science and Complexity

Abstract

The optimal control problem with a long-run average cost is investigated for unknown linear discrete-time systems with additive noise. The authors propose a value iteration-based stochastic adaptive dynamic programming (VI-based SADP) algorithm, from which the optimal controller is obtained. Unlike existing related work, the algorithm does not need to estimate the expectation (or conditional expectation) or the variance (or conditional variance) of the states or other relevant variables, and its convergence can be proved rigorously. A simulation example is given to verify the effectiveness of the proposed approach.
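
For orientation, the model-based counterpart of value iteration for a scalar linear-quadratic problem with additive noise can be sketched as follows. This is not the VI-based SADP algorithm of the paper, which works without knowledge of the system parameters. The sketch assumes the system x_{k+1} = a x_k + b u_k + w_k with known a and b, stage cost q x^2 + r u^2, and noise variance sigma2; all names and numerical values are illustrative assumptions.

```python
def riccati_value_iteration(a, b, q, r, tol=1e-12, max_iter=10_000):
    """Iterate the scalar Riccati recursion from p = 0 until it settles.

    Assumes the model (a, b) is known, which the paper does NOT assume.
    """
    p = 0.0
    for _ in range(max_iter):
        # One value-iteration step: minimise q*x^2 + r*u^2 + p*(a*x + b*u)^2 over u.
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p


# Hypothetical parameter values chosen only for illustration.
a, b, q, r, sigma2 = 1.0, 1.0, 1.0, 1.0, 0.25

p = riccati_value_iteration(a, b, q, r)
gain = a * b * p / (r + b * b * p)   # feedback law u_k = -gain * x_k
avg_cost = p * sigma2                # long-run average cost under that law

print(f"p = {p:.6f}, gain = {gain:.6f}, average cost = {avg_cost:.6f}")
```

With known parameters the recursion converges to the stabilizing solution of the scalar Riccati equation, and the additive noise enters the long-run average cost only through the product of p and the noise variance. The setting of the paper differs in that a and b are unknown and must be handled from data.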

Author information

Authors and Affiliations

School of Mathematics, Sichuan University, Chengdu, 610064, China
Xue Yang & Shujun Liu

Corresponding author

Correspondence to Shujun Liu.

Additional information

This research was supported by the National Natural Science Foundation of China under Grant No. 61673284 and the Science Development Project of Sichuan University under Grant No. 2020SCUNL201.

About this article

Cite this article

Yang, X., Liu, S. Optimal Control of Unknown Discrete-Time Linear Systems with Additive Noise. J Syst Sci Complex 36, 591–612 (2023). https://doi.org/10.1007/s11424-023-1352-4

Received: 18 March 2021
Revised: 16 September 2021
Published: 19 April 2023
Issue Date: April 2023
DOI: https://doi.org/10.1007/s11424-023-1352-4
