
Self-learning-based optimal tracking control of an unmanned surface vehicle with pose and velocity constraints. (English) Zbl 1527.93320

Summary: In this article, a self-learning-based optimal tracking control (SLOTC) scheme is developed for an unmanned surface vehicle (USV) subject to both pose and velocity constraints within narrow waters, by combining an actor-critic reinforcement learning (RL) mechanism with the backstepping technique. Specifically, a barrier Lyapunov function (BLF) is devised to uniformly confine the states to a predefined region around a smooth, feasible reference trajectory. By virtue of a constrained Hamilton-Jacobi-Bellman (HJB) function, an actor-critic control structure under backstepping is established by employing adaptive neural network identifiers that recursively update the actor and the critic simultaneously. Theoretical analysis proves that the entire SLOTC scheme keeps all states within the predefined compact set while the tracking errors converge to an arbitrarily small neighborhood of the origin. Simulation results on a prototype USV demonstrate the effectiveness and superiority of the scheme.
{© 2022 John Wiley & Sons Ltd.}
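
For orientation, a common log-type barrier Lyapunov function from the constrained-control literature is sketched below. This is an illustrative assumption, not necessarily the exact form used in the article; the symbols $z$ (a scalar tracking error) and $k_b$ (its constraint bound) are hypothetical names introduced here.

```latex
% Illustrative log-type BLF (assumed form; z and k_b are hypothetical symbols)
\[
  V_b(z) = \frac{1}{2}\ln\frac{k_b^{2}}{k_b^{2}-z^{2}}, \qquad |z| < k_b ,
\]
% Its time derivative along trajectories of z:
\[
  \dot V_b = \frac{z\,\dot z}{k_b^{2}-z^{2}} .
\]
```

Because $V_b \to \infty$ as $|z| \to k_b$, keeping $V_b$ bounded along closed-loop trajectories keeps $|z| < k_b$ for all time, which is the mechanism by which a BLF confines the states to a predefined region.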

MSC:

93C85 Automated systems (robots, etc.) in control theory
93D30 Lyapunov and storage functions
68T05 Learning and adaptive systems in artificial intelligence
49L12 Hamilton-Jacobi equations in optimal control and differential games
