An Approach to Hierarchical Deep Reinforcement Learning for a Decentralized Walking Control Architecture

Schilling, Malte; Melnik, Andrew

doi:10.1007/978-3-319-99316-4_36

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 848)

Included in the following conference series:

Biologically Inspired Cognitive Architectures Meeting

Abstract

Locomotion in animals is a stable, rhythmic behavior that is at the same time flexible and extremely adaptive. Many motor control approaches have made considerable progress by drawing on insights from biology. As one example, the Walknet approach for six-legged robots realizes a decentralized and modular structure that reflects insights from walking in stick insects. While this approach can deal with a variety of disturbances during locomotion, it is still limited when dealing with novel and particularly challenging walking situations. This has led to a cognitive expansion that allows behaviors to be tested outside their original context and a solution to be searched for through a form of internal simulation. What is still missing in this approach is any form of learning, as well as variation of the lower-level motor primitives themselves to cope with difficult situations. Here, we propose how this biologically inspired approach can be extended to include a form of trial-and-error learning. The realization is currently underway and is based on a broader formulation as a hierarchical reinforcement learning problem. Importantly, the structure of the hierarchy follows the decentralized organization taken from insects.
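
To make the idea of decentralized trial-and-error learning more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes six independent leg controllers, each of which sees only its own leg state and learns by itself which motor primitive to select. The primitive names, the leg states, the `context` label standing in for a higher level of the hierarchy, and the toy reward rule are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical low-level motor primitives (not taken from the paper).
PRIMITIVES = ["swing", "stance"]


class LegController:
    """One local learner per leg; it only ever sees its own leg state."""

    def __init__(self, epsilon=0.1, lr=0.5):
        self.q = {}  # (context, local_state, primitive) -> value estimate
        self.epsilon = epsilon
        self.lr = lr

    def act(self, context, local_state, rng):
        # Epsilon-greedy choice: mostly exploit, sometimes try something else.
        if rng.random() < self.epsilon:
            return rng.choice(PRIMITIVES)
        return max(
            PRIMITIVES,
            key=lambda p: self.q.get((context, local_state, p), 0.0),
        )

    def update(self, context, local_state, primitive, reward):
        # Move the value estimate a step towards the observed reward.
        key = (context, local_state, primitive)
        old = self.q.get(key, 0.0)
        self.q[key] = old + self.lr * (reward - old)


def toy_reward(local_state, primitive):
    # Invented rule for illustration: a leg with ground contact should
    # stay in stance, a lifted leg should swing.
    wanted = "stance" if local_state == "ground_contact" else "swing"
    return 1.0 if primitive == wanted else 0.0


def train(steps=500, seed=0):
    rng = random.Random(seed)
    legs = [LegController() for _ in range(6)]  # six-legged walker
    for _ in range(steps):
        # Fixed here; in a hierarchy a higher level would select this.
        context = "walk_forward"
        for leg in legs:
            state = rng.choice(["ground_contact", "lifted"])
            primitive = leg.act(context, state, rng)
            reward = toy_reward(state, primitive)
            leg.update(context, state, primitive, reward)
    return legs
```

Each leg keeps its own value table and never reads another leg's state, which mirrors the decentralized organization in a very reduced form. A deep reinforcement learning variant would replace the tables with neural networks and the toy reward with feedback from a simulated or real robot.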

Acknowledgments

This research was supported by the Cluster of Excellence Cognitive Interaction Technology ‘CITEC’ (EXC 277) at Bielefeld University, which is funded by the German Research Foundation (DFG).

Author information

Authors and Affiliations

Center of Excellence ‘Cognitive Interaction Technology’, Bielefeld University, Bielefeld, Germany
Malte Schilling & Andrew Melnik

Corresponding author

Correspondence to Malte Schilling.

Editor information

Editors and Affiliations

Department of Cybernetics, National Research Nuclear University “MEPhI”, Moscow, Russia
Alexei V. Samsonovich

About this paper

Cite this paper

Schilling, M., Melnik, A. (2019). An Approach to Hierarchical Deep Reinforcement Learning for a Decentralized Walking Control Architecture. In: Samsonovich, A. (eds) Biologically Inspired Cognitive Architectures 2018. BICA 2018. Advances in Intelligent Systems and Computing, vol 848. Springer, Cham. https://doi.org/10.1007/978-3-319-99316-4_36

DOI: https://doi.org/10.1007/978-3-319-99316-4_36
Published: 24 August 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99315-7
Online ISBN: 978-3-319-99316-4
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
