
Adaptive optimal control of affine nonlinear systems via identifier-critic neural network approximation with relaxed PE conditions. (English) Zbl 1530.93204

Summary: This paper considers the optimal control of an affine nonlinear system with unknown system dynamics. A new identifier-critic (IC) framework is proposed to solve the optimal control problem. First, a neural network (NN) identifier is built to estimate the unknown system dynamics, and a critic NN is constructed to solve the Hamilton-Jacobi-Bellman equation associated with the optimal control problem. A dynamic regressor extension and mixing (DREM) technique is applied to design the weight update laws of the two neural networks under relaxed persistence of excitation (PE) conditions. The convergence of the parameter estimates under the update laws and the stability of the closed-loop system under the adaptive optimal control are analyzed with a Lyapunov function method. Numerical simulation results demonstrate the effectiveness of the proposed IC learning-based optimal control algorithm for the affine nonlinear system.
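The DREM-based weight adaptation is the ingredient that relaxes the PE requirement. The following minimal Python sketch illustrates only the generic DREM mechanics, not the paper's identifier-critic design: a linear-in-parameters regression is extended with delayed copies of the regressor and mixed with the adjugate of the extended regressor matrix, so that each parameter obeys its own scalar estimator. The regressor phi, the delays, the gain gamma and the vector theta_true below are illustrative assumptions.

# Minimal sketch of dynamic regressor extension and mixing (DREM); all signals
# and gains are illustrative, not taken from the paper.
import numpy as np

theta_true = np.array([1.5, -0.7])   # "unknown" parameters playing the role of NN weights
dt, T = 1e-3, 20.0                   # Euler step and simulation horizon
delays = [0.0, 0.5]                  # delay-based regressor extension (seconds)
gamma = 50.0                         # adaptation gain of each scalar estimator

def phi(t):
    # illustrative regressor of the linear-in-parameters model y(t) = phi(t)^T theta
    return np.array([1.0, np.sin(t)])

theta_hat = np.zeros(2)
for k in range(int(T / dt)):
    t = k * dt
    # extension: stack the regression equation evaluated at delayed time instants
    Phi = np.array([phi(max(t - d, 0.0)) for d in delays])               # 2 x 2
    Y = np.array([phi(max(t - d, 0.0)) @ theta_true for d in delays])    # measurements
    # mixing: multiply by the adjugate of Phi so that adj(Phi) Y = det(Phi) * theta,
    # i.e. every parameter gets its own scalar regression
    Delta = Phi[0, 0] * Phi[1, 1] - Phi[0, 1] * Phi[1, 0]                # det(Phi)
    adj = np.array([[Phi[1, 1], -Phi[0, 1]], [-Phi[1, 0], Phi[0, 0]]])
    Yi = adj @ Y
    # element-wise gradient update; each estimate converges whenever Delta is not
    # square-integrable, a condition weaker than classical PE of phi
    theta_hat += dt * gamma * Delta * (Yi - Delta * theta_hat)

print("estimate:", theta_hat, "  true:", theta_true)

In the paper this decoupling is applied to the identifier and critic weight update laws, which is what allows parameter convergence to be established without a full PE assumption.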

MSC:

93C40 Adaptive control/observation systems
93C10 Nonlinear systems in control theory
49N90 Applications of optimal control and differential games
Full Text: DOI

References:

[1] Aranovskiy, S.; Bobtsov, A.; Ortega, R.; Pyrkin, A., Performance enhancement of parameter estimators via dynamic regressor extension and mixing, IEEE Transactions on Automatic Control, 62, 7, 3546-3550 (2016) · Zbl 1370.93250
[2] Bhasin, S.; Kamalapurkar, R.; Johnson, M.; Vamvoudakis, K. G.; Lewis, F. L.; Dixon, W. E., A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems, Automatica, 49, 1, 82-92 (2013) · Zbl 1257.93055
[3] Sastry, S.; Bodson, M., Adaptive control: Stability, convergence and robustness (1989), Prentice-Hall: Prentice-Hall Englewood Cliffs, NJ, USA · Zbl 0721.93046
[4] Chen, B.; Hu, J.; Zhao, Y.; Ghosh, B. K., Finite-time velocity-free rendezvous control of multiple AUV systems with intermittent communication, IEEE Transactions on Systems, Man, and Cybernetics, 52, 10, 6618-6629 (2022)
[5] Cho, N.; Shin, H.-S.; Kim, Y.; Tsourdos, A., Composite model reference adaptive control with parameter convergence under finite excitation, IEEE Transactions on Automatic Control, 63, 3, 811-818 (2018) · Zbl 1390.93427
[6] Ioannou, P. A.; Sun, J., Robust adaptive control (1996), Prentice-Hall: Prentice-Hall Englewood Cliffs, NJ, USA · Zbl 0839.93002
[7] Kamalapurkar, R.; Andrews, L.; Walters, P.; Dixon, W. E., Model-based reinforcement learning for infinite-horizon approximate optimal tracking, IEEE Transactions on Neural Networks and Learning Systems, 28, 3, 753-758 (2017)
[8] Korotina, M.; Romero, J. G.; Aranovskiy, S.; Bobtsov, A.; Ortega, R., A new on-line exponential parameter estimator without persistent excitation, Systems & Control Letters, 159, Article 105079 pp. (2022) · Zbl 1485.93567
[9] Lv, Y.; Na, J.; Yang, Q.; Wu, X.; Guo, Y., Online adaptive optimal control for continuous-time nonlinear systems with completely unknown dynamics, International Journal of Control, 89, 1, 99-112 (2015) · Zbl 1332.93174
[10] Lv, Y.; Wu, Z.; Zhao, X., Data-based optimal microgrid management for energy trading with integral Q-learning scheme, IEEE Internet of Things Journal (2023)
[11] Modares, H.; Lewis, F. L.; Naghibi-Sistani, M. B., Adaptive optimal control of unknown constrained-input systems using policy iteration and neural networks, IEEE Transactions on Neural Networks and Learning Systems, 24, 10, 1513-1525 (2013)
[12] Mu, C.; Zhang, Y.; Gao, Z.; Sun, C., ADP-based robust tracking control for a class of nonlinear systems with unmatched uncertainties, IEEE Transactions on Systems, Man, and Cybernetics, 50, 11, 4056-4067 (2019)
[13] Mynuddin, M.; Gao, W., Distributed predictive cruise control based on reinforcement learning and validation on microscopic traffic simulation, IET Intelligent Transport Systems, 14, 5, 270-277 (2020)
[14] Na, J.; Lv, Y.; Zhang, K.; Zhao, J., Adaptive identifier-critic-based optimal tracking control for nonlinear systems with experimental validation, IEEE Transactions on Systems, Man, and Cybernetics, 52, 1, 459-472 (2022)
[15] Narayanan, V.; Jagannathan, S., Event-triggered distributed control of nonlinear interconnected systems using online reinforcement learning with exploration, IEEE Transactions on Cybernetics, 48, 9, 2510-2519 (2018)
[16] Ortega, R.; Aranovskiy, S.; Pyrkin, A.; Astolfi, A.; Bobtsov, A., New results on parameter estimation via dynamic regressor extension and mixing: Continuous and discrete-time cases, IEEE Transactions on Automatic Control, 66, 5, 2265-2272 (2020) · Zbl 1536.93890
[17] Ortega, R.; Nikiforov, V.; Gerasimov, D., On modified parameter estimators for identification and adaptive control. A unified framework and some new schemes, Annual Reviews in Control, 50, 278-293 (2020)
[18] Pang, B.; Jiang, Z. P.; Mareels, I., Reinforcement learning for adaptive optimal control of continuous-time linear periodic systems, Automatica, 118, Article 109035 pp. (2020) · Zbl 1447.93177
[19] Peng, Z.; Hu, J.; Shi, K.; Luo, R.; Huang, R.; Ghosh, B. K., A novel optimal bipartite consensus control scheme for unknown multi-agent systems via model-free reinforcement learning, Applied Mathematics and Computation, 369, Article 124821 pp. (2020) · Zbl 1433.93008
[20] Peng, Z.; Ji, H.; Zou, C.; Kuang, Y.; Cheng, H.; Shi, K., Optimal \(H_\infty\) tracking control of nonlinear systems with zero-equilibrium-free via novel adaptive critic designs, Neural Networks, 164, 105-114 (2023)
[21] Peng, Z.; Luo, R.; Hu, J.; Shi, K.; Nguang, S. K.; Ghosh, B. K., Optimal tracking control of nonlinear multiagent systems using internal reinforce Q-learning, IEEE Transactions on Neural Networks and Learning Systems, 33, 8, 4043-4055 (2022)
[22] Peng, Z.; Yan, W.; Huang, R.; Cheng, H.; Shi, K.; Ghosh, B. K., Event-triggered learning robust tracking control of robotic systems with unknown uncertainties, IEEE Transactions on Circuits and Systems II: Express Briefs, 70, 7, 2540-2544 (2023)
[23] Peng, Z.; Zhao, Y.; Hu, J.; Luo, R.; Ghosh, B. K.; Nguang, S. K., Input-output data-based output antisynchronization control of multi-agent systems using reinforcement learning approach, IEEE Transactions on Industrial Informatics, 17, 11, 7359-7367 (2021)
[24] Song, Y.; Zhao, K.; Krstic, M., Adaptive control with exponential regulation in the absence of persistent excitation, IEEE Transactions on Automatic Control, 62, 5, 2589-2596 (2017) · Zbl 1366.93301
[25] Sun, T.; Sun, X. M., An adaptive dynamic programming scheme for nonlinear optimal control with unknown dynamics and its application to turbofan engines, IEEE Transactions on Industrial Informatics, 17, 1, 367-376 (2021)
[26] Tatari, F.; Vamvoudakis, K. G.; Mazouchi, M., Optimal distributed learning for disturbance rejection in networked non-linear games under unknown dynamics, IET Control Theory & Applications, 13, 17, 2838-2848 (2019)
[27] Vamvoudakis, K. G.; Miranda, M. F.; Hespanha, J. P., Asymptotically stable adaptive-optimal control algorithm with saturating actuators and relaxed persistence of excitation, IEEE Transactions on Neural Networks and Learning Systems, 27, 11, 2386-2398 (2016)
[28] Wang, D.; He, H.; Mu, C.; Liu, D., Intelligent critic control with disturbance attenuation for affine dynamics including an application to a microgrid system, IEEE Transactions on Industrial Electronics, 64, 6, 4935-4944 (2017)
[29] Werbos, P., Advanced forecasting methods for global crisis warning and models of intelligence, General Systems Yearbook, 22, 25-38 (1977)
[30] Werbos, P. J., A menu of designs for reinforcement learning over time (1991), MIT Press: MIT Press Cambridge, MA, USA
[31] Xue, S.; Luo, B.; Liu, D.; Gao, Y., Event-triggered ADP for tracking control of partially unknown constrained uncertain systems, IEEE Transactions on Cybernetics, 52, 9, 9001-9012 (2022)
[32] Xue, S.; Luo, B.; Liu, D.; Gao, Y., Event-triggered integral reinforcement learning for nonzero-sum games with asymmetric input saturation, Neural Networks, 152, 212-223 (2022) · Zbl 07751349
[33] Yang, X.; He, H., Adaptive critic designs for event-triggered robust control of nonlinear systems with unknown dynamics, IEEE Transactions on Cybernetics, 49, 6, 2255-2267 (2019)
[34] Yang, X.; Zhou, Y.; Gao, Z., Reinforcement learning for robust stabilization of nonlinear systems with asymmetric saturating actuators, Neural Networks, 158, 132-141 (2022) · Zbl 1525.93343
[35] Zamfirache, I. A.; Precup, R.-E.; Roman, R.-C.; Petriu, E. M., Reinforcement learning-based control using Q-learning and gravitational search algorithm with experimental validation on a nonlinear servo system, Information Sciences, 583, 99-120 (2022) · Zbl 1532.93132
[36] Zhao, B.; Luo, F.; Lin, H.; Liu, D., Particle swarm optimized neural networks based local tracking control scheme of unknown nonlinear interconnected systems, Neural Networks, 134, 54-63 (2021)
[37] Zhao, D.; Zhang, Q.; Wang, D.; Zhu, Y., Experience replay for optimal control of nonzero-sum game systems with unknown dynamics, IEEE Transactions on Cybernetics, 46, 3, 854-865 (2016)