Abstract
A deep reinforcement learning approach for solving the quadrotor path following and obstacle avoidance problem is proposed in this paper. The problem is solved with two agents: one for the path following task and another for the obstacle avoidance task. A novel structure is proposed, in which the action computed by the obstacle avoidance agent becomes the state of the path following agent. Compared to traditional deep reinforcement learning approaches, the proposed method makes the outcomes of the training process interpretable, trains faster, and can be trained safely on the real quadrotor. Both agents implement the Deep Deterministic Policy Gradient algorithm. The path following agent was developed in a previous work. The obstacle avoidance agent uses the information provided by a low-cost LIDAR to detect obstacles around the vehicle. Since the LIDAR has a narrow field of view, an approach is developed to provide the agent with a memory of previously seen obstacles. A detailed description of the process of defining the state vector, the reward function and the action of this agent is given. The agents are programmed in Python/TensorFlow and are trained and tested on the RotorS/Gazebo platform. Simulation results prove the validity of the proposed approach.
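The cascaded structure described above, where the obstacle avoidance agent's action is fed into the path following agent's state, can be sketched as follows. This is a minimal illustrative sketch only: the policy functions, state dimensions, and variable names are hypothetical stand-ins for the trained DDPG actor networks described in the paper, not the authors' implementation.

```python
import numpy as np

def oa_policy(lidar_scan: np.ndarray) -> float:
    """Placeholder obstacle avoidance actor: returns a lateral
    avoidance command in [-1, 1] (hypothetical, not the paper's network)."""
    # Toy heuristic: steer away from the side with the closest obstacle.
    mid = len(lidar_scan) // 2
    left, right = lidar_scan[:mid].min(), lidar_scan[mid:].min()
    return float(np.clip(right - left, -1.0, 1.0))

def pf_policy(pf_state: np.ndarray) -> np.ndarray:
    """Placeholder path following actor: maps its state, which includes
    the OA agent's action, to a dummy 2-D control command."""
    return np.tanh(pf_state[:2])

# One control step of the cascade.
lidar_scan = np.array([4.0, 3.5, 0.8, 5.0, 6.0, 6.0])  # toy ranges (m)
path_error = np.array([0.3, -0.1])                      # toy path-following error
a_oa = oa_policy(lidar_scan)                            # OA agent acts first...
pf_state = np.concatenate([path_error, [a_oa]])         # ...its action enters the PF state
u = pf_policy(pf_state)                                 # PF agent computes the control
```

The design choice this sketch highlights is that the path following agent never sees the raw LIDAR data; it only sees the avoidance command, which keeps its state small and makes the interaction between the two tasks explicit.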
Funding
Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This work has been partially funded by the Spanish Government (MINECO) through the project CICYT (ref. DPI2017-88403-R).
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Rubí, B., Morcego, B. & Pérez, R. Quadrotor Path Following and Reactive Obstacle Avoidance with Deep Reinforcement Learning. J Intell Robot Syst 103, 62 (2021). https://doi.org/10.1007/s10846-021-01491-2