Optimized backstepping consensus control using adaptive observer-critic-actor reinforcement learning for strict-feedback multi-agent systems. (English) Zbl 1536.93835

Summary: In this paper, we present a novel control method for strict-feedback multi-agent systems with unmeasurable states, combining observer-based optimized backstepping (OB) control, a reinforcement learning (RL) strategy, and adaptive neural networks (NNs). The primary objective is to enhance the overall backstepping controller by optimizing both the virtual and actual controllers of the corresponding subsystems. To achieve this, we develop an observer-critic-actor RL approach based on NN approximation in every backstepping step. The observers estimate the unmeasurable system states, while the critic-actor algorithm assesses the control performance and executes the control actions. Our optimized control method offers advantages such as not requiring the state observer to satisfy the Hurwitz equation. Additionally, the designed RL algorithm is simple because the critic-actor adaptive laws are obtained from the negative gradient of a simple positive function associated with the partial derivative of the Hamilton-Jacobi-Bellman (HJB) equation. Consequently, the proposed control method ensures that all error states of the multi-agent systems are semi-globally uniformly ultimately bounded (SGUUB) and that all outputs synchronously follow the reference signal with the desired accuracy. Finally, we demonstrate the efficacy of our control strategy through theoretical analysis and simulation results.
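The flavor of the critic-actor adaptive laws described above can be illustrated on a toy problem. The sketch below is not the paper's design: it uses a scalar plant, a single tracking error, and hypothetical gains (`gc`, `ga`, `k`) and RBF features chosen only for illustration. It shows the structural idea from the summary: the critic weights descend the gradient of a simple positive function of the critic output, the actor weights descend a positive function that pulls the actor toward the critic, and the resulting tracking error stays ultimately bounded near zero.

```python
import numpy as np

def rbf(z, centers, width=1.0):
    """Gaussian radial-basis activations for a scalar input z."""
    return np.exp(-((z - centers) ** 2) / (2.0 * width ** 2))

centers = np.linspace(-2.0, 2.0, 7)
Wc = 0.5 * np.ones(7)   # critic NN weights (arbitrary nonzero start)
Wa = np.zeros(7)        # actor NN weights
gc, ga = 2.0, 2.0       # critic/actor adaptation gains (illustrative)
k, dt = 3.0, 1e-3       # stabilizing gain, Euler step size

x = 1.0                 # plant state; reference signal is x_d = 0
for _ in range(20000):
    z = x               # tracking error
    S = rbf(z, centers)
    u = -k * z - 0.5 * (Wa @ S)   # actor-generated control action
    # Negative-gradient adaptive laws: the critic descends the simple
    # positive function P_c = (S @ Wc)**2 / 2, and the actor descends
    # P_a = ((S @ Wa) - (S @ Wc))**2 / 2, so the actor tracks the critic.
    Wc += dt * (-gc * (S @ Wc) * S)
    Wa += dt * (-ga * ((S @ Wa) - (S @ Wc)) * S)
    x += dt * (x ** 2 + u)        # plant dx/dt = x^2 + u, Euler step
```

After the simulation, `x` has settled near zero while the weights remain bounded, mirroring (in miniature) the SGUUB-type conclusion stated in the summary; the multi-agent, output-feedback, and backstepping aspects of the actual paper are not represented here.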

MSC:

93D50 Consensus
93B52 Feedback control
93A16 Multi-agent systems
93C40 Adaptive control/observation systems
93B53 Observers
Full Text: DOI

References:

[1] Bryson, A. E., Optimal control-1950 to 1985, IEEE Control Syst. Mag., 16, 3, 26-33, 1996
[2] Asher, Z. D.; Baker, D. A.; Bradley, T. H., Prediction error applied to hybrid electric vehicle optimal fuel economy, IEEE Trans. Control Syst. Technol., 26, 6, 2121-2134, 2018
[3] Wang, W.; Liu, K.; Yang, C.; Xu, B.; Ma, M., Cyber physical energy optimization control design for PHEVs based on enhanced firework algorithm, IEEE Trans. Veh. Technol., 70, 1, 282-291, 2020
[4] Chai, R.; Tsourdos, A.; Savvaris, A.; Chai, S.; Xia, Y.; Chen, C. P., Six-DOF spacecraft optimal trajectory planning and real-time attitude control: A deep neural network-based approach, IEEE Trans. Neural Netw. Learn. Syst., 31, 11, 5005-5013, 2019
[5] Wen, G.; Hao, W.; Feng, W.; Gao, K., Optimized backstepping tracking control using reinforcement learning for quadrotor unmanned aerial vehicle system, IEEE Trans. Syst. Man Cybern.: Syst., 52, 8, 5004-5015, 2022
[6] Guo, K.; Hu, Y.; Qian, Z.; Liu, H.; Zhang, K.; Sun, Y.; Gao, J.; Yin, B., Optimized graph convolution recurrent neural network for traffic prediction, IEEE Trans. Intell. Transp. Syst., 22, 2, 1138-1149, 2020
[7] Lee, H.; Lee, S. H.; Quek, T. Q., Deep learning for distributed optimization: Applications to wireless resource management, IEEE J. Sel. Areas Commun., 37, 10, 2251-2266, 2019
[8] Tang, W.; Zhang, Y. P.; Zhou, X. Y., Exploratory HJB equations and their convergence, SIAM J. Control Optim., 60, 6, 3191-3216, 2022 · Zbl 1501.35132
[9] Li, J.; Cui, H., Optimal trajectory exploration large-scale deep reinforcement learning tuned optimal controller for proton exchange membrane fuel cell, J. Franklin Inst. B, 359, 15, 8107-8126, 2022 · Zbl 1497.93073
[10] Wen, G.; Chen, C. P.; Li, W. N., Simplified optimized control using reinforcement learning algorithm for a class of stochastic nonlinear systems, Inform. Sci., 517, 230-243, 2020 · Zbl 1461.93555
[11] Heydari, A., Optimal scheduling for reference tracking or state regulation using reinforcement learning, J. Franklin Inst. B, 352, 8, 3285-3303, 2015 · Zbl 1395.93275
[12] Liu, T.; Hu, X.; Li, S. E.; Cao, D., Reinforcement learning optimized look-ahead energy management of a parallel hybrid electric vehicle, IEEE/ASME Trans. Mechatronics, 22, 4, 1497-1507, 2017
[13] Li, Y.; Fan, Y.; Li, K.; Liu, W.; Tong, S., Adaptive optimized backstepping control-based RL algorithm for stochastic nonlinear systems with state constraints and its application, IEEE Trans. Cybern., 52, 10, 10542-10555, 2021
[14] Dominguez, R.; Cannella, S., Insights on multi-agent systems applications for supply chain management, Sustainability, 12, 5, 1935, 2020
[15] Wu, J.; Yuan, S.; Ji, S.; Zhou, G.; Wang, Y.; Wang, Z., Multi-agent system design and evaluation for collaborative wireless sensor network in large structure health monitoring, Expert Syst. Appl., 37, 3, 2028-2036, 2010
[16] Coelho, V. N.; Cohen, M. W.; Coelho, I. M.; Liu, N.; Guimarães, F. G., Multi-agent systems applied for energy systems integration: State-of-the-art applications and trends in microgrids, Appl. Energy, 187, 820-832, 2017
[17] He, W.; Cao, J., Consensus control for high-order multi-agent systems, IET Control Theory Appl., 5, 1, 231-238, 2011
[18] Gao, R.; Huang, J.; Wang, L., Leaderless consensus control of uncertain multi-agents systems with sensor and actuator attacks, Inform. Sci., 505, 144-156, 2019 · Zbl 1460.93090
[19] Ding, L.; Han, Q.-L.; Guo, G., Network-based leader-following consensus for distributed multi-agent systems, Automatica, 49, 7, 2281-2286, 2013 · Zbl 1364.93014
[20] Cui, Y.; Liu, X.; Deng, X.; Wen, G., Command-filter-based adaptive finite-time consensus control for nonlinear strict-feedback multi-agent systems with dynamic leader, Inform. Sci., 565, 17-31, 2021 · Zbl 1530.93510
[21] Shen, Q.; Shi, P., Distributed command filtered backstepping consensus tracking control of nonlinear multiple-agent systems in strict-feedback form, Automatica, 53, 120-124, 2015 · Zbl 1371.93019
[22] Movric, K. H.; Lewis, F. L., Cooperative optimal control for multi-agent systems on directed graph topologies, IEEE Trans. Automat. Control, 59, 3, 769-774, 2013 · Zbl 1360.49026
[23] Zhang, Z.; Zhang, S.; Li, H.; Yan, W., Cooperative robust optimal control of uncertain multi-agent systems, J. Franklin Inst. B, 357, 14, 9467-9483, 2020 · Zbl 1448.93069
[24] Wen, G.; Chen, C. L. P., Optimized backstepping consensus control using reinforcement learning for a class of nonlinear strict-feedback-dynamic multi-agent systems, IEEE Trans. Neural Netw. Learn. Syst., 34, 3, 1524-1536, 2023
[25] Lin, Z.; Liu, Z.; Zhang, Y.; Chen, C. P., Adaptive neural inverse optimal tracking control for uncertain multi-agent systems, Inform. Sci., 584, 31-49, 2022 · Zbl 1532.93172
[26] Li, H.; Wu, C.; Yin, S.; Lam, H.-K., Observer-based fuzzy control for nonlinear networked systems under unmeasurable premise variables, IEEE Trans. Fuzzy Syst., 24, 5, 1233-1245, 2016
[27] Li, Y.; Xu, N.; Niu, B.; Chang, Y.; Zhao, J.; Zhao, X., Small-gain technique-based adaptive fuzzy command filtered control for uncertain nonlinear systems with unmodeled dynamics and disturbances, Internat. J. Adapt. Control Signal Process., 35, 9, 1664-1684, 2021 · Zbl 07840340
[28] Li, Y.; Liu, Y.; Tong, S., Observer-based neuro-adaptive optimized control of strict-feedback nonlinear systems with state constraints, IEEE Trans. Neural Netw. Learn. Syst., 33, 7, 3131-3145, 2021
[29] Wen, G.; Li, B.; Niu, B., Optimized backstepping control using reinforcement learning of observer-critic-actor architecture based on fuzzy system for a class of nonlinear strict-feedback systems, IEEE Trans. Fuzzy Syst., 30, 10, 4322-4335, 2022
[30] Chen, W.; Jiao, L.; Li, J.; Li, R., Adaptive NN backstepping output-feedback control for stochastic nonlinear strict-feedback systems with time-varying delays, IEEE Trans. Syst. Man Cybern. B, 40, 3, 939-950, 2009
[31] Liu, D.; Huang, Y.; Wang, D.; Wei, Q., Neural-network-observer-based optimal control for unknown nonlinear systems using adaptive dynamic programming, Internat. J. Control, 86, 9, 1554-1566, 2013 · Zbl 1278.93145
[32] Sui, S.; Chen, C. L. P.; Tong, S., Neural network filtering control design for nontriangular structure switched nonlinear systems in finite time, IEEE Trans. Neural Netw. Learn. Syst., 30, 7, 2153-2162, 2019
[33] Fei, J.; Ding, H., Adaptive sliding mode control of dynamic system using RBF neural network, Nonlinear Dynam., 70, 1563-1573, 2012
[34] Han, Q.-L.; Liu, Y.; Yang, F., Optimal communication network-based \(H_\infty\) quantized control with packet dropouts for a class of discrete-time neural networks with distributed time delay, IEEE Trans. Neural Netw. Learn. Syst., 27, 2, 426-434, 2015
[35] Chen, C. P.; Wen, G.-X.; Liu, Y.-J.; Liu, Z., Observer-based adaptive backstepping consensus tracking control for high-order nonlinear semi-strict-feedback multiagent systems, IEEE Trans. Cybern., 46, 7, 1591-1601, 2015
[36] Wang, G.; Wang, C.; Li, L.; Du, Q., Distributed adaptive consensus tracking control of higher-order nonlinear strict-feedback multi-agent systems using neural networks, Neurocomputing, 214, 269-279, 2016
[37] Li, Y.-M.; Min, X.; Tong, S., Adaptive fuzzy inverse optimal control for uncertain strict-feedback nonlinear systems, IEEE Trans. Fuzzy Syst., 28, 10, 2363-2374, 2020
[38] Liu, Y.; Yao, D.; Li, H.; Lu, R., Distributed cooperative compound tracking control for a platoon of vehicles with adaptive NN, IEEE Trans. Cybern., 52, 7, 7039-7048, 2022
[39] Ma, J.; Xu, S.; Li, Y.; Chu, Y.; Zhang, Z., Neural networks-based adaptive output feedback control for a class of uncertain nonlinear systems with input delay and disturbances, J. Franklin Inst. B, 355, 13, 5503-5519, 2018 · Zbl 1451.93197
[40] Wang, M.; Liang, H.; Pan, Y.; Xie, X., A new privacy preservation mechanism and a gain iterative disturbance observer for multiagent systems, IEEE Trans. Netw. Sci. Eng., 1-11, 2023
[41] Cao, L.; Pan, Y.; Liang, H.; Huang, T., Observer-based dynamic event-triggered control for multiagent systems with time-varying delay, IEEE Trans. Cybern., 53, 5, 3376-3387, 2023
[42] Chen, L.; Liang, H.; Pan, Y.; Li, T., Human-in-the-loop consensus tracking control for UAV systems via an improved prescribed performance approach, IEEE Trans. Aerosp. Electron. Syst., 1-12, 2023
[43] Cao, L.; Cheng, Z.; Liu, Y.; Li, H., Event-based adaptive NN fixed-time cooperative formation for multiagent systems, IEEE Trans. Neural Netw. Learn. Syst., 1-11, 2022
[44] Liu, D.; Huang, Y.; Wang, D.; Wei, Q., Neural-network-observer-based optimal control for unknown nonlinear systems using adaptive dynamic programming, Internat. J. Control, 86, 9, 1554-1566, 2013 · Zbl 1278.93145
[45] Tong, S.; Sun, K.; Sui, S., Observer-based adaptive fuzzy decentralized optimal control design for strict-feedback nonlinear large-scale systems, IEEE Trans. Fuzzy Syst., 26, 2, 569-584, 2018
[46] Wen, G.; Xu, L.; Li, B., Optimized backstepping tracking control using reinforcement learning for a class of stochastic nonlinear strict-feedback systems, IEEE Trans. Neural Netw. Learn. Syst., 34, 3, 1291-1303, 2023
[47] Zhang, H.; Wang, H.; Niu, B.; Zhang, L.; Ahmad, A. M., Sliding-mode surface-based adaptive actor-critic optimal control for switched nonlinear systems with average dwell time, Inform. Sci., 580, 756-774, 2021 · Zbl 07786227
[48] Wen, G.; Niu, B., Optimized tracking control based on reinforcement learning for a class of high-order unknown nonlinear dynamic systems, Inform. Sci., 606, 368-379, 2022 · Zbl 1533.93302
[49] Wen, G.; Ge, S. S.; Chen, C. L. P.; Tu, F.; Wang, S., Adaptive tracking control of surface vessel using optimized backstepping technique, IEEE Trans. Cybern., 49, 9, 3420-3431, 2019
[50] Vamvoudakis, K. G.; Vrabie, D.; Lewis, F. L., Online adaptive algorithm for optimal control with integral reinforcement learning, Internat. J. Robust Nonlinear Control, 24, 17, 2686-2710, 2014 · Zbl 1304.49059
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced with data from zbMATH Open. The list attempts to reflect the references of the original paper as accurately as possible, without claiming completeness or perfect matching.