Adaptive inverse optimal control for rehabilitation robot systems using actor-critic algorithm. (English) Zbl 1407.93265

Summary: The higher-level goal of a rehabilitation robot is to help a person achieve a desired functional task (e.g., trajectory tracking) based on the assist-as-needed principle. To this end, a new adaptive inverse optimal hybrid control (AHC) scheme that combines inverse optimal control with actor-critic learning is proposed. Specifically, an uncertain nonlinear rehabilitation robot model that includes human motor behavior dynamics is first developed. Based on this model, an open-loop error system is formed. An inverse optimal control input is then designed to minimize the cost functional, while a neural network (NN)-based actor-critic feedforward signal compensates for the nonlinear dynamics affected by uncertainties. Finally, a Lyapunov-based stability analysis proves that the AHC controller yields a globally uniformly ultimately bounded stability result and that the resulting cost functional is meaningful. Simulations and experiments on a rehabilitation robot demonstrate the effectiveness of the proposed control scheme.

MSC:

93C85 Automated systems (robots, etc.) in control theory
49N35 Optimal feedback synthesis

References:

[1] Eschweiler, J.; Gerlach-Hahn, K.; Jansen-Toy, A., A survey on robotic devices for upper limb rehabilitation, Journal of Neuroengineering and Rehabilitation, 11, 1, article 3 (2014)
[2] Ziherl, J.; Novak, D.; Olenšek, A.; Mihelj, M.; Munih, M., Evaluation of upper extremity robot-assistances in subacute and chronic stroke subjects, Journal of NeuroEngineering and Rehabilitation, 7, 1, article 52 (2010) · doi:10.1186/1743-0003-7-52
[3] Hayward, K.; Barker, R.; Brauer, S., Interventions to promote upper limb recovery in stroke survivors with severe paresis: a systematic review, Disability and Rehabilitation, 32, 24, 1973-1986 (2010) · doi:10.3109/09638288.2010.481027
[4] Lewis, G. N.; Perreault, E. J., An assessment of robot-assisted bimanual movements on upper limb motor coordination following stroke, IEEE Transactions on Neural Systems and Rehabilitation Engineering, 17, 6, 595-604 (2009) · doi:10.1109/TNSRE.2009.2029315
[5] Krebs, H. I.; Palazzolo, J. J.; Dipietro, L.; Ferraro, M.; Krol, J.; Rannekleiv, K.; Volpe, B. T.; Hogan, N., Rehabilitation robotics: performance-based progressive robot-assisted therapy, Autonomous Robots, 15, 1, 7-20 (2003) · doi:10.1023/A:1024494031121
[6] Loureiro, R.; Amirabdollahian, F.; Topping, M.; Driessen, B.; Harwin, W., Upper limb robot mediated stroke therapy—GENTLE/s approach, Autonomous Robots, 15, 1, 35-51 (2003) · doi:10.1023/A:1024436732030
[7] Lum, P. S.; Burgar, C. G.; Van Der Loos, M.; Shor, P. C.; Majmundar, M.; Yap, R., MIME robotic device for upper-limb neurorehabilitation in subacute stroke subjects: a follow-up study, Journal of Rehabilitation Research and Development, 43, 5, 631-642 (2006) · doi:10.1682/JRRD.2005.02.0044
[8] Kahn, L. E.; Lum, P. S.; Rymer, W. Z.; Reinkensmeyer, D. J., Robot-assisted movement training for the stroke-impaired arm: does it matter what the robot does?, Journal of Rehabilitation Research and Development, 43, 5, 619-630 (2006) · doi:10.1682/JRRD.2005.03.0056
[9] Kwakkel, G.; Kollen, B. J.; Krebs, H. I., Effects of robot-assisted therapy on upper limb recovery after stroke: a systematic review, Neurorehabilitation and Neural Repair, 22, 2, 111-121 (2008) · doi:10.1177/1545968307305457
[10] Colombo, R.; Pisano, F.; Micera, S.; Mazzone, A.; Delconte, C.; Chiara Carrozza, M.; Dario, P.; Minuco, G., Robotic techniques for upper limb evaluation and rehabilitation of stroke patients, IEEE Transactions on Neural Systems and Rehabilitation Engineering, 13, 3, 311-324 (2005) · doi:10.1109/TNSRE.2005.848352
[11] Tsuji, T.; Tanaka, Y., Tracking control properties of human-robotic systems based on impedance control, IEEE Transactions on Systems, Man, and Cybernetics A: Systems and Humans, 35, 4, 523-535 (2005) · doi:10.1109/TSMCA.2005.850603
[12] Choi, Y.; Gordon, J.; Kim, D.; Schweighofer, N., An adaptive automated robotic task-practice system for rehabilitation of arm functions after stroke, IEEE Transactions on Robotics, 25, 3, 556-568 (2009) · doi:10.1109/TRO.2009.2019787
[13] Meng, F.; Dai, Y.; Jin, Y.; Wang, Y., The design of a new upper limb rehabilitation robot system based on multi-source data fusion, Proceedings of the 10th World Congress on Intelligent Control and Automation (WCICA ’12) · doi:10.1109/WCICA.2012.6359113
[14] Choi, Y.; Gordon, J.; Park, H.; Schweighofer, N., Feasibility of the adaptive and automatic presentation of tasks (ADAPT) system for rehabilitation of upper extremity function post-stroke, Journal of NeuroEngineering and Rehabilitation, 8, 1, article 42 (2011) · doi:10.1186/1743-0003-8-42
[15] Duff, M.; Chen, Y.; Attygalle, S., An adaptive mixed reality training system for stroke rehabilitation, IEEE Transactions on Neural Systems and Rehabilitation Engineering, 18, 5, 531-541 (2010) · doi:10.1109/TNSRE.2010.2055061
[16] Chen, Y.; Duff, M.; Lehrer, N.; Liu, S.; Blake, P.; Wolf, S. L.; Sundaram, H.; Rikakis, T., A novel adaptive mixed reality system for stroke rehabilitation: principles, proof of concept, and preliminary application in 2 patients, Topics in Stroke Rehabilitation, 18, 3, 212-230 (2011) · doi:10.1310/tsr1803-212
[17] Erdogan, A.; Satici, A. C.; Patoglu, V., Passive velocity field control of a forearm-wrist rehabilitation robot, Proceedings of the IEEE International Conference on Rehabilitation Robotics (ICORR ’11) · doi:10.1109/ICORR.2011.5975433
[18] Jezernik, S.; Wassink, R. G. V.; Keller, T., Sliding mode closed-loop control of FES controlling the shank movement, IEEE Transactions on Biomedical Engineering, 51, 2, 263-272 (2004) · doi:10.1109/TBME.2003.820393
[19] Ajoudani, A.; Erfanian, A., A neuro-sliding-mode control with adaptive modeling of uncertainty for control of movement in paralyzed limbs using functional electrical stimulation, IEEE Transactions on Biomedical Engineering, 56, 7, 1771-1780 (2009) · doi:10.1109/TBME.2009.2017030
[20] Chen, C. S., Dynamic structure neural-fuzzy networks for robust adaptive control of robot manipulators, IEEE Transactions on Industrial Electronics, 55, 9, 3402-3414 (2008) · doi:10.1109/TIE.2008.926778
[21] Kiguchi, K.; Tanaka, T.; Fukuda, T., Neuro-fuzzy control of a robotic exoskeleton with EMG signals, IEEE Transactions on Fuzzy Systems, 12, 4, 481-490 (2004) · doi:10.1109/TFUZZ.2004.832525
[22] Sharma, N.; Gregory, C. M.; Johnson, M.; Dixon, W. E., Closed-loop neural network-based NMES control for human limb tracking, IEEE Transactions on Control Systems Technology, 20, 3, 712-725 (2012) · doi:10.1109/TCST.2011.2125792
[23] Arya, K. N.; Pandian, S.; Verma, R.; Garg, R. K., Movement therapy induced neural reorganization and motor recovery in stroke: a review, Journal of Bodywork and Movement Therapies, 15, 4, 528-537 (2011) · doi:10.1016/j.jbmt.2011.01.023
[24] Davoodi, R.; Andrews, B. J., Computer simulation of FES standing up in paraplegia: a self-adaptive fuzzy controller with reinforcement learning, IEEE Transactions on Rehabilitation Engineering, 6, 2, 151-161 (1998) · doi:10.1109/86.681180
[25] Izawa, J.; Kondo, T.; Ito, K., Biological arm motion through reinforcement learning, Biological Cybernetics, 91, 1, 10-22 (2004) · Zbl 1059.92004 · doi:10.1007/s00422-004-0485-3
[26] Kartoun, U.; Stern, H.; Edan, Y., A human-robot collaborative reinforcement learning algorithm, Journal of Intelligent and Robotic Systems, 60, 2, 217-239 (2010) · Zbl 1203.68242 · doi:10.1007/s10846-010-9422-y
[27] Wang, S.; Chaovalitwongse, W.; Babuška, R., Machine learning algorithms in bipedal robot control, IEEE Transactions on Systems, Man and Cybernetics C: Applications and Reviews, 42, 5, 728-743 (2012) · doi:10.1109/TSMCC.2012.2186565
[28] Tamei, T.; Shibata, T., Fast reinforcement learning for three-dimensional kinetic human-robot cooperation with an EMG-to-activation model, Advanced Robotics, 25, 5, 563-580 (2011) · doi:10.1163/016918611X558252
[29] Galan, G.; Jagannathan, S., Adaptive critic-based neural network object contact controller for a three-finger gripper, Proceedings of the IEEE International Symposium on Intelligent Control (ISIC ’01)
[30] Kim, B.; Park, J.; Park, S.; Kang, S., Impedance learning for robotic contact tasks using natural actor-critic algorithm, IEEE Transactions on Systems, Man, and Cybernetics B: Cybernetics, 40, 2, 433-443 (2010) · doi:10.1109/TSMCB.2009.2026289
[31] Thomas, P. S.; Branicky, M.; van den Bogert, A.; Jagodnik, K., Creating a reinforcement learning controller for functional electrical stimulation of a human arm, Proceedings of the Yale Workshop on Adaptive and Learning Systems
[32] Pilarski, P. M.; Dawson, M. R.; Degris, T.; Fahimi, F.; Carey, J. P.; Sutton, R. S., Online human training of a myoelectric prosthesis controller via actor-critic reinforcement learning, Proceedings of the IEEE International Conference on Rehabilitation Robotics (ICORR ’11) · doi:10.1109/ICORR.2011.5975338
[33] Bhasin, S.; Johnson, M.; Dixon, W. E., A model-free robust policy iteration algorithm for optimal control of nonlinear systems, Proceedings of the 49th IEEE Conference on Decision and Control (CDC ’10), IEEE · doi:10.1109/CDC.2010.5717295
[34] Vamvoudakis, K. G.; Lewis, F. L., Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem, Automatica, 46, 5, 878-888 (2010) · Zbl 1191.49038 · doi:10.1016/j.automatica.2010.02.018
[35] Bhasin, S.; Kamalapurkar, R.; Johnson, M.; Vamvoudakis, K. G.; Lewis, F. L.; Dixon, W. E., A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems, Automatica, 49, 1, 82-92 (2013) · Zbl 1257.93055 · doi:10.1016/j.automatica.2012.09.019
[36] Li, Y.; Ge, S. S., Human-robot collaboration based on motion intention estimation, IEEE/ASME Transactions on Mechatronics, 19, 3, 1007-1014 (2013) · doi:10.1109/TMECH.2013.2264533
[37] Newman, W. S., Stability and performance limits of interaction controllers, Journal of Dynamic Systems, Measurement and Control, Transactions of the ASME, 114, 4, 563-570 (1992) · Zbl 0825.93072 · doi:10.1115/1.2897725
[38] Rahman, M.; Ikeura, R.; Mizutani, K., Investigation of the impedance characteristic of human arm for development of robots to cooperate with humans, JSME International Journal C: Mechanical Systems, Machine Elements and Manufacturing, 45, 2, 510-518 (2002)
[39] Wolbrecht, E. T.; Chan, V.; Reinkensmeyer, D. J.; Bobrow, J. E., Optimizing compliant, model-based robotic assistance to promote neurorehabilitation, IEEE Transactions on Neural Systems and Rehabilitation Engineering, 16, 3, 286-297 (2008) · doi:10.1109/TNSRE.2008.918389
[40] Freeman, C. T.; Rogers, E.; Hughes, A.-M.; Meadmore, K. L., Iterative learning control in health care: electrical stimulation and robotic-assisted upper-limb stroke rehabilitation, IEEE Control Systems Magazine, 32, 1, 18-43 (2012) · Zbl 1395.93388 · doi:10.1109/MCS.2011.2173261
[41] Lewis, F. L.; Selmic, R.; Campos, J., Neuro-Fuzzy Control of Industrial Systems with Actuator Nonlinearities (2002), Philadelphia, Pa, USA: Society for Industrial and Applied Mathematics · Zbl 1010.93001 · doi:10.1137/1.9780898717563
[42] Jiang, Y.; Jiang, Z.-P., Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics, Automatica, 48, 10, 2699-2704 (2012) · Zbl 1271.93088 · doi:10.1016/j.automatica.2012.06.096
[43] Jiang, Y.; Jiang, Z.-P., Robust approximate dynamic programming and global stabilization with nonlinear dynamic uncertainties, Proceedings of the 50th IEEE Conference on Decision and Control and European Control Conference (CDC-ECC ’11) · doi:10.1109/CDC.2011.6160279
[44] Puga-Guzmán, S.; Moreno-Valenzuela, J.; Santibáñez, V., Adaptive neural network motion control of manipulators with experimental evaluations, The Scientific World Journal, 2014 (2014) · doi:10.1155/2014/694706
[45] Bhasin, S.; Sharma, N.; Patre, P.; Dixon, W., Asymptotic tracking by a reinforcement learning-based adaptive critic controller, Journal of Control Theory and Applications, 9, 3, 400-409 (2011) · doi:10.1007/s11768-011-0170-8
[46] Campos, J.; Lewis, F. L., Adaptive critic neural network for feedforward compensation, Proceedings of the American Control Conference (ACC ’99)
[47] Kuljaca, O.; Lewis, F. L., Adaptive critic design using non-linear network structures, International Journal of Adaptive Control and Signal Processing, 17, 6, 431-445 (2003) · Zbl 1046.93508 · doi:10.1002/acs.760
[48] Lewis, F. L.; Vrabie, D., Reinforcement learning and adaptive dynamic programming for feedback control, IEEE Circuits and Systems Magazine, 9, 3, 32-50 (2009) · doi:10.1109/MCAS.2009.933854
[49] Krstic, M.; Tsiotras, P., Inverse optimal stabilization of a rigid spacecraft, IEEE Transactions on Automatic Control, 44, 5, 1042-1049 (1999) · Zbl 1136.93424 · doi:10.1109/9.763225
[50] Hu, J.; Dawson, D. M.; Qian, Y., Position tracking control for robot manipulators driven by induction motors without flux measurements, IEEE Transactions on Robotics and Automation, 12, 3, 419-438 (1996) · doi:10.1109/70.499824
[51] Kung, P. C.; Lin, C. C. K.; Ju, M. S., Neuro-rehabilitation robot-assisted assessments of synergy patterns of forearm, elbow and shoulder joints in chronic stroke patients, Clinical Biomechanics, 25, 7, 647-654 (2010) · doi:10.1016/j.clinbiomech.2010.04.014