
Implicit contact dynamics modeling with explicit inertia matrix representation for real-time, model-based control in physical environment. (English) Zbl 1483.93146

Summary: Model-based control has great potential for use in real robots due to its high sample efficiency. Nevertheless, practical robot control tasks such as precise manipulation inevitably involve physical contact and demand accurate motion generation. For a real-time, model-based approach, the difficulty of contact-rich tasks that require precise movement lies in the fact that a model needs to accurately predict forthcoming contact events within a limited length of time, rather than detect them afterward with sensors. Therefore, in this study, we investigate whether and how neural network models can learn a task-related model accurate enough for model-based control, that is, a model that predicts future states, including contact events. To this end, we propose a structured neural network model predictive control (SNN-MPC) method, whose neural network architecture is designed with an explicit inertia matrix representation. To train the proposed network, we develop a two-stage procedure for modeling contact-rich dynamics from a limited number of samples. As a contact-rich task, we take up a trackball manipulation task using a physical 3-DoF finger robot. The results showed that SNN-MPC outperformed MPC with a conventional fully connected network model on the manipulation task.
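To illustrate the idea of an explicit inertia matrix representation in a learned dynamics model, the following is a minimal sketch, not the authors' actual SNN-MPC architecture: a small network predicts the lower-triangular Cholesky factor of the inertia matrix \(M(q) = LL^\top\) (with a softplus on the diagonal so \(M\) is symmetric positive definite), while a second network lumps the remaining terms (Coriolis, gravity, friction, contact forces) into a bias vector \(h(q,\dot q)\), giving \(\ddot q = M(q)^{-1}(\tau - h(q,\dot q))\). All function names, layer sizes, and the 3-DoF setting are illustrative assumptions.

```python
import numpy as np

def mlp(x, weights):
    """Tiny MLP with tanh hidden activations (illustrative only)."""
    for W, b in weights[:-1]:
        x = np.tanh(W @ x + b)
    W, b = weights[-1]
    return W @ x + b

def make_weights(sizes, rng):
    """Random small-scale weights for each layer (no training here)."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(m))
            for n, m in zip(sizes[:-1], sizes[1:])]

def inertia_matrix(q, weights, dof=3):
    """Predict M(q) via a Cholesky factor L: M = L L^T.
    A softplus plus a small offset keeps the diagonal positive,
    so M is symmetric positive definite by construction."""
    out = mlp(q, weights)                       # dof*(dof+1)//2 entries
    L = np.zeros((dof, dof))
    L[np.tril_indices(dof)] = out
    d = np.arange(dof)
    L[d, d] = np.log1p(np.exp(L[d, d])) + 1e-4  # softplus diagonal
    return L @ L.T

def forward_dynamics(q, qdot, tau, M_weights, h_weights, dof=3):
    """qddot = M(q)^{-1} (tau - h(q, qdot)): the inertia matrix is
    explicit; h absorbs Coriolis, gravity, friction, and contacts."""
    M = inertia_matrix(q, M_weights, dof)
    h = mlp(np.concatenate([q, qdot]), h_weights)
    return np.linalg.solve(M, tau - h)

rng = np.random.default_rng(0)
dof = 3  # matches the 3-DoF finger robot setting
M_w = make_weights([dof, 32, dof * (dof + 1) // 2], rng)
h_w = make_weights([2 * dof, 32, dof], rng)

q, qdot, tau = rng.standard_normal((3, dof))
M = inertia_matrix(q, M_w, dof)
qddot = forward_dynamics(q, qdot, tau, M_w, h_w, dof)
```

The point of the structure is that, unlike a fully connected black-box model, \(M(q)\) is guaranteed symmetric positive definite for every input, which keeps the predicted accelerations physically consistent and the model usable inside an MPC loop.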

MSC:

93B45 Model predictive control
93C85 Automated systems (robots, etc.) in control theory
68T05 Learning and adaptive systems in artificial intelligence

Software:

PILCO; Chainer
