Abstract
Like control systems, reinforcement learning can capture notions of optimal behavior from natural interaction experience. In reinforcement learning, the temporal difference error of the generated experience measures how well the learner responds to the system; in particular, the sequential difference of the accumulated temporal difference error can indicate learning performance. In this paper, we fully exploit the error-correcting, closed-loop character of reinforcement learning by mapping a representation of this error to the step-size component. The proposed cyclic step-size better controls how new estimates are iteratively blended together over time; the new estimates guide the action selection process, which in turn influences the value distribution. To guide more promising action decisions, an ensemble action selector is proposed that incorporates the idea of the ensemble "wisdom of the weak." Experimental results on a gridworld mobile robot navigation task demonstrate the validity, fast learning, and easy plug-in implementation of the derived algorithm, increasing its applicability to real-life problems.
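As a rough illustration of the two ideas summarized in the abstract, the sketch below shows tabular Q-learning on a toy gridworld in which (i) the step size is driven by the magnitude of the TD error and (ii) actions are chosen by majority vote over several weak epsilon-greedy voters. The gridworld, the tanh mapping in adaptive_alpha, and the voting scheme in ensemble_action are our own illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 5x5 gridworld: states 0..24, goal at state 24.
N, GOAL, ACTIONS = 5, 24, 4

def step(s, a):
    # Deterministic transitions; small step cost, reward at the goal.
    r, c = divmod(s, N)
    if a == 0:   r = max(r - 1, 0)      # up
    elif a == 1: r = min(r + 1, N - 1)  # down
    elif a == 2: c = max(c - 1, 0)      # left
    else:        c = min(c + 1, N - 1)  # right
    s2 = r * N + c
    return s2, (1.0 if s2 == GOAL else -0.01), s2 == GOAL

Q = np.zeros((N * N, ACTIONS))
gamma, alpha_min, alpha_max = 0.95, 0.05, 0.9

def adaptive_alpha(td_error, scale=1.0):
    # Map the TD-error magnitude to a step size in [alpha_min, alpha_max]:
    # large errors trigger aggressive updates, small errors blend
    # new estimates conservatively (an assumed stand-in for the
    # paper's cyclic step-size mapping).
    return alpha_min + (alpha_max - alpha_min) * np.tanh(abs(td_error) / scale)

def ensemble_action(s, eps=0.1, n_voters=5):
    # "Wisdom of the weak": several noisy epsilon-greedy voters each
    # propose an action; the majority vote is taken.
    votes = []
    for _ in range(n_voters):
        if rng.random() < eps:
            votes.append(int(rng.integers(ACTIONS)))
        else:
            noisy = Q[s] + rng.normal(0.0, 0.01, ACTIONS)  # break ties randomly
            votes.append(int(np.argmax(noisy)))
    return int(np.bincount(votes, minlength=ACTIONS).argmax())

for episode in range(300):
    s, done = 0, False
    while not done:
        a = ensemble_action(s)
        s2, r, done = step(s, a)
        td = r + gamma * (0.0 if done else Q[s2].max()) - Q[s, a]
        Q[s, a] += adaptive_alpha(td) * td  # error-driven step size
        s = s2

print(np.argmax(Q, axis=1).reshape(N, N))  # learned greedy policy per cell
```

Note the design choice: because adaptive_alpha depends on the current TD error, updates stay large while value estimates are poorly calibrated and shrink as learning converges, which is one plausible reading of the closed-loop error correction described above.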
Additional information
Recommended by Associate Editor Huaping Liu under the direction of Editor Fuchun Sun. This work is supported by the National Natural Science Foundation of China under Grant No. 61273327.
Rongkuan Tang received his B.S. degree in Automation from Changshu Institute of Technology in 2013. He received his M.S. degree in Control Science and Engineering from Tongji University in 2016. His research interests include reinforcement learning and its application to robots.
Hongliang Yuan received his B.S. degree in Automatic Control from the University of Science and Technology of China in 2002. He received his M.S. and Ph.D. degrees in Electrical Engineering from the University of Central Florida, USA, in 2006 and 2009, respectively. He is currently an associate professor at the School of Electronics and Information Engineering, Tongji University, P.R. China. His research interests include robotics, intelligent control, optimization, and vehicle dynamics and control.
About this article
Cite this article
Tang, R., Yuan, H. Cyclic error correction based Q-learning for mobile robots navigation. Int. J. Control Autom. Syst. 15, 1790–1798 (2017). https://doi.org/10.1007/s12555-015-0392-5