Abstract
Although learning control policies from demonstrations has been thoroughly investigated in the literature, generalizing policies to new contexts remains a challenge, since existing approaches exhibit limited performance on new tasks. In this article, we propose two approaches for generalizing motion-based force control policies, with a view to performing constrained motions in the presence of motion-dependent external forces. The key concept of the proposed methods is to use, apart from policy values, policy derivatives or differences, which express how the policy varies with respect to variations in its input, and to combine these two kinds of information to generalize the policy to new inputs. The first approach learns policy and policy derivative values by linear regression and combines them into a first-order Taylor-like polynomial to estimate the policy at new inputs. The second approach learns policy and policy difference data by locally weighted regression and combines them by superposition to estimate the policy at new inputs. The policy differences in this approach represent variations of the policy in the direction that minimizes the distance between the new incoming input and the average demonstrated input. The proposed approaches are evaluated in real-world constrained robot motion tasks using a linear-actuated, two-degree-of-freedom haptic device.
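The first approach can be illustrated with a minimal sketch. The abstract does not specify the features, the data, or how derivative values are obtained, so everything below is an assumption made for illustration: a scalar input, a hypothetical polynomial basis, a toy force profile, and derivatives taken by finite differences. Two linear regressions are fitted, one for policy values and one for policy derivatives, and their predictions at a demonstrated reference input are combined into a first-order Taylor-like estimate at a new input.

```python
import numpy as np


def poly_features(x, degree=3):
    """Hypothetical basis [1, x, x^2, ...] for scalar inputs (an assumption)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return np.vander(x, degree + 1, increasing=True)


def fit_linear(features, targets):
    """Ordinary least-squares weights such that targets ~ features @ w."""
    w, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return w


def taylor_like_estimate(x_new, x_ref, w_value, w_deriv, degree=3):
    """First-order estimate: pi(x_ref) + dpi/dx(x_ref) * (x_new - x_ref)."""
    phi = poly_features(x_ref, degree)
    value = phi @ w_value   # regressed policy value at the reference input
    slope = phi @ w_deriv   # regressed policy derivative at the reference input
    return (value + slope * (x_new - x_ref)).item()


# Toy demonstration data (assumed): inputs in [0, 1], force policy sin(2x).
x_demo = np.linspace(0.0, 1.0, 50)
f_demo = np.sin(2.0 * x_demo)
# Derivative data approximated by finite differences (an assumption).
df_demo = np.gradient(f_demo, x_demo, edge_order=2)

# Separate linear regressions for policy values and policy derivatives.
features = poly_features(x_demo)
w_value = fit_linear(features, f_demo)
w_deriv = fit_linear(features, df_demo)

# Estimate the policy at an input slightly outside the demonstrated range.
estimate = taylor_like_estimate(1.1, 1.0, w_value, w_deriv)
print(estimate, np.sin(2.2))
```

The estimate is only first-order accurate, so its error grows with the distance between the new input and the reference input; the sketch is not meant to reproduce the article's formulation or results.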
Cite this article
Koropouli, V., Hirche, S. & Lee, D. Generalization of Force Control Policies from Demonstrations for Constrained Robotic Motion Tasks. J Intell Robot Syst 80 (Suppl 1), 133–148 (2015). https://doi.org/10.1007/s10846-015-0218-y
DOI: https://doi.org/10.1007/s10846-015-0218-y