
Model-Free Reinforcement Learning-Based Control for Continuous-Time Systems

  • Living reference work entry
Encyclopedia of Systems and Control

Abstract

This entry discusses model-free reinforcement-learning algorithms based on a continuous-time Q-learning framework. The presented schemes solve infinite-horizon optimization problems for linear time-invariant systems with completely unknown dynamics and single or multiple players/controllers. We first formulate the appropriate Q-functions (action-dependent value functions) and then the tuning laws, based on actor/critic structures, for several cases: optimal regulation, Nash games, multi-agent systems, and intermittent feedback.
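The entry's algorithms are continuous-time tuning laws, which the paywalled full text develops. As a loose, self-contained illustration of the underlying actor/critic Q-learning idea, the sketch below applies least-squares policy iteration to a zero-order-hold sampled version of a scalar plant. Everything here is a hypothetical construction for illustration (the plant, gains, and sample sizes are assumptions, not taken from the entry): the learner never touches the system matrices, only measured transitions and stage costs.

```python
import numpy as np

# Hypothetical 1-D plant (an assumption of this sketch, not from the entry):
# x_dot = a*x + b*u with cost  integral of (q*x^2 + r*u^2) dt.
# The learner never uses (a, b); it only sees sampled transitions, which is
# the "completely unknown dynamics" setting, here on a sampled-data model.
a, b, q, r, dt = 1.0, 1.0, 1.0, 1.0, 0.05

# Exact zero-order-hold discretization -- used only by the simulator.
Ad = np.exp(a * dt)
Bd = (Ad - 1.0) / a * b

def step(x, u):
    """Simulator: next state and stage cost (rectangle-rule integral)."""
    return Ad * x + Bd * u, (q * x**2 + r * u**2) * dt

def features(x, u):
    """Quadratic basis for Q(x,u) = [x u] H [x; u]; params (h11, h12, h22)."""
    return np.stack([x**2, 2 * x * u, u**2], axis=-1)

rng = np.random.default_rng(0)
K = -2.0  # initial stabilizing gain (assumed known, as in policy iteration)
for it in range(10):
    # Critic (policy evaluation): least squares on the Bellman equation
    # Q(x,u) = c(x,u) + Q(x', K x') over randomly explored (x, u) pairs.
    x = rng.uniform(-1, 1, 2000)
    u = rng.uniform(-1, 1, 2000)
    xn, c = step(x, u)
    Phi = features(x, u) - features(xn, K * xn)
    theta, *_ = np.linalg.lstsq(Phi, c, rcond=None)
    h11, h12, h22 = theta
    # Actor (policy improvement): u = -H_uu^{-1} H_ux x.
    K = -h12 / h22

print("learned gain:", K)                # approaches the LQR-optimal gain
print("closed-loop pole:", Ad + Bd * K)  # magnitude < 1 => stable
```

For this plant the continuous-time Riccati solution gives an optimal gain near -2.41, and the learned sampled-data gain lands close to it without the learner ever seeing `a` or `b`; the gap shrinks as `dt` decreases. The entry's continuous-time schemes avoid the discretization altogether by tuning the critic and actor parameters directly from online data.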




Acknowledgements

This material is based upon the work supported by NSF under Grant Numbers CPS-1851588, SATC-1801611, and S&AS-1849198, by ARO under Grant Number W911NF1910270, and by Minerva Research Initiative under Grant Number N00014-18-1-2160.

Corresponding author

Correspondence to Kyriakos G. Vamvoudakis.

Editor information

Editors and Affiliations

Section Editor information

Rights and permissions

Reprints and permissions

Copyright information

© 2020 Springer-Verlag London Ltd., part of Springer Nature

About this entry


Cite this entry

Vamvoudakis, K.G. (2020). Model-Free Reinforcement Learning-Based Control for Continuous-Time Systems. In: Baillieul, J., Samad, T. (eds) Encyclopedia of Systems and Control. Springer, London. https://doi.org/10.1007/978-1-4471-5102-9_100065-1

Download citation

  • DOI: https://doi.org/10.1007/978-1-4471-5102-9_100065-1


  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-5102-9

  • Online ISBN: 978-1-4471-5102-9

  • eBook Packages: Springer Reference Engineering, Reference Module Computer Science and Engineering
