
Continuous interval type-2 fuzzy Q-learning algorithm for trajectory tracking tasks for vehicles. (English) Zbl 1528.93123

Summary: Trajectory tracking is a fundamental but challenging task for vehicle automation. Beyond the system nonlinearity, the main difficulties in trajectory tracking stem from environmental noise and model uncertainties under different driving scenarios. Given these uncertainties, a reinforcement learning method with continuous actions and noise-resistance capability is a promising way to overcome these issues. In this article, a novel continuous interval type-2 fuzzy Q-learning (CIT2FQL) algorithm is proposed to deal with the trajectory tracking task. By introducing the \(n\)-dimensional interval type-2 fuzzy inference system (\(n\)-D IT2FIS) into fuzzy Q-learning, our proposed method achieves continuous Q-learning by combining action interpolation with an IT2FIS for the first time. We also propose a simplified type-reduction method for the \(n\)-D IT2FIS to improve the computational efficiency of the proposed method. Moreover, a radial basis function (RBF) layer is chosen as the basis function to perform the \(q\)-value interpolation. Finally, a trajectory tracking task in a simulation environment is conducted to verify the effectiveness and robustness of the proposed method under different scenarios. The results demonstrate that the proposed method has better robustness and noise-resistance capability while maintaining good tracking performance compared with state-of-the-art baseline algorithms, including double deep Q network (DDQN), proximal policy optimization (PPO), and interval type-2 dynamic fuzzy Q-learning (IT2DFQL).
{© 2022 The Authors. International Journal of Robust and Nonlinear Control published by John Wiley & Sons Ltd.}
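The fuzzy Q-learning scheme the summary describes can be illustrated with a minimal sketch. This is not the authors' implementation: it uses Gaussian interval type-2 membership functions over a 1-D state (e.g. lateral tracking error) with an uncertain width, a simple Nie-Tan-style average of lower and upper firing strengths standing in for the paper's simplified type reduction, and Glorennec-style fuzzy Q-learning where each rule picks an action and the continuous output is the firing-strength-weighted blend. All constants, the toy plant, and the function names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Rule centres over a 1-D state (e.g. lateral tracking error); an
# uncertain Gaussian width gives interval type-2 membership functions.
CENTRES = np.linspace(-1.0, 1.0, 5)
SIGMA_LO, SIGMA_HI = 0.25, 0.40          # footprint of uncertainty (assumed)
ACTIONS = np.linspace(-0.5, 0.5, 7)      # candidate steering commands (assumed)

q = np.zeros((len(CENTRES), len(ACTIONS)))  # per-rule q-values

def firing(x):
    """Interval firing strengths, reduced by averaging lower and upper
    bounds (a Nie-Tan-style stand-in for the paper's type reduction)."""
    lo = np.exp(-0.5 * ((x - CENTRES) / SIGMA_LO) ** 2)
    hi = np.exp(-0.5 * ((x - CENTRES) / SIGMA_HI) ** 2)
    f = 0.5 * (lo + hi)
    return f / f.sum()

def act(x, eps=0.1):
    """Each rule selects an action epsilon-greedily; the continuous
    control is the firing-strength-weighted blend of rule actions."""
    w = firing(x)
    idx = np.where(rng.random(len(CENTRES)) < eps,
                   rng.integers(len(ACTIONS), size=len(CENTRES)),
                   q.argmax(axis=1))
    u = float(w @ ACTIONS[idx])          # continuous action
    Q = float(w @ q[np.arange(len(CENTRES)), idx])
    return u, Q, w, idx

def update(w, idx, Q, reward, x_next, gamma=0.95, alpha=0.1):
    """TD update distributed over the rules by their firing strengths."""
    w_next = firing(x_next)
    Q_next = float(w_next @ q.max(axis=1))
    td = reward + gamma * Q_next - Q
    q[np.arange(len(CENTRES)), idx] += alpha * td * w

# One toy interaction step on a made-up 1-D error dynamic.
x = 0.3
u, Q, w, idx = act(x)
x_next = x - 0.5 * u                     # hypothetical plant response
update(w, idx, Q, reward=-abs(x_next), x_next=x_next)
```

The paper additionally interpolates \(q\)-values with an RBF layer to obtain truly continuous action values; the sketch above keeps a discrete per-rule action table for brevity.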

MSC:

93C42 Fuzzy control/observation systems
93C85 Automated systems (robots, etc.) in control theory
68T07 Artificial neural networks and deep learning

References:

[1] Neukart F, Compostella G, Seidel C, Von Dollen D, Yarkoni S, Parney B. Traffic flow optimization using a quantum annealer. Front ICT. 2017;4:29.
[2] Walraven E, Spaan MTJ, Bakker B. Traffic flow optimization: a reinforcement learning approach. Eng Appl Artif Intell. 2016;52:203‐212.
[3] Zhao F, Zeng G‐Q, Lu K‐D. EnLSTM‐WPEO: short‐term traffic flow prediction by ensemble LSTM, NNCT weight integration, and population extremal optimization. IEEE Trans Veh Technol. 2019;69(1):101‐113.
[4] Guo K, Li X, Xie L. Ultra‐wideband and odometry‐based cooperative relative localization with application to multi‐UAV formation control. IEEE Trans Cybern. 2019;50(6):2590‐2603.
[5] Xuan‐Mung N, Hong SK. Robust adaptive formation control of quadcopters based on a leader-follower approach. Int J Adv Robot Syst. 2019;16(4):1729881419862733.
[6] Chu Z, Zhu D, Yang SX. Observer‐based adaptive neural network trajectory tracking control for remotely operated vehicle. IEEE Trans Neural Netw Learn Syst. 2016;28(7):1633‐1645.
[7] Wang N, Su S‐F, Pan X, Yu X, Xie G. Yaw‐guided trajectory tracking control of an asymmetric underactuated surface vehicle. IEEE Trans Ind Inform. 2018;15(6):3502‐3513.
[8] Coulter RC. Implementation of the Pure Pursuit Path Tracking Algorithm. Robotics Institute at Carnegie Mellon University; 1992.
[9] Sun C, Zhang X, Zhou Q, Tian Y. A model predictive controller with switched tracking error for autonomous vehicle path tracking. IEEE Access. 2019;7:53103‐53114.
[10] Nguyen A‐T, Sentouh C, Popieul J‐C. Fuzzy steering control for autonomous vehicles under actuator saturation: design and experiments. J Franklin Inst. 2018;355(18):9374‐9395. · Zbl 1404.93023
[11] Aripin MK, Ghazali R, Sam YM, Danapalasingam K, Ismail MF. Uncertainty modelling and high performance robust controller for active front steering control. Proceedings of the 2015 10th Asian Control Conference (ASCC); 2015:1‐6; IEEE.
[12] Zhang C, Hu J, Qiu J, Yang W, Sun H, Chen Q. A novel fuzzy observer‐based steering control approach for path tracking in autonomous vehicles. IEEE Trans Fuzzy Syst. 2018;27(2):278‐290.
[13] Borase RP, Maghade DK, Sondkar SY, Pawar SN. A review of PID control, tuning methods and applications. Int J Dyn Control. 2020:1‐10.
[14] Fayjie AR, Hossain S, Oualid D, Lee DJ. Driverless car: autonomous driving using deep reinforcement learning in urban environment. Proceedings of the 2018 15th International Conference on Ubiquitous Robots (UR); 2018:896‐901; IEEE.
[15] Morimoto J, Doya K. Robust reinforcement learning. Neural Comput. 2005;17(2):335‐359.
[16] Mankowitz DJ, Levine N, Jeong R, et al. Robust reinforcement learning for continuous control with model misspecification; 2019. arXiv preprint arXiv:1906.07516.
[17] Hsu CH, Juang CF. Self‐organizing interval type‐2 fuzzy Q‐learning for reinforcement fuzzy control. Proceedings of the 2011 IEEE International Conference on Systems, Man, and Cybernetics; 2011:2033‐2038; IEEE.
[18] Yi Z, Li G, Chen S, Xie W, Xu B. A navigation method for mobile robots using interval type‐2 fuzzy neural network fitting Q‐learning in unknown environments. J Intell Fuzzy Syst. 2019;37(1):1113‐1121.
[19] Xue H, Ding D, Zhang Z, Wu M, Wang H. A fuzzy system of operation safety assessment using multi‐model linkage and multi‐stage collaboration for in‐wheel motor. IEEE Trans Fuzzy Syst. 2021.
[20] Castillo O, Amador‐Angulo L. A generalized type‐2 fuzzy logic approach for dynamic parameter adaptation in bee colony optimization applied to fuzzy controller design. Inf Sci. 2018;460:476‐496.
[21] Wang G, Jia Q‐S, Qiao J, Bi J, Liu C. A sparse deep belief network with efficient fuzzy learning framework. Neural Netw. 2020;121:430‐440.
[22] Zadeh LA. Fuzzy sets. Inf Control. 1965;8(3):338‐353. doi:10.1016/S0019-9958(65)90241-X · Zbl 0139.24606
[23] Tanaka K, Wang HO. Fuzzy Control Systems Design and Analysis: A Linear Matrix Inequality Approach. John Wiley & Sons; 2004.
[24] Wu D, Mendel JM. Uncertainty measures for interval type‐2 fuzzy sets. Inf Sci. 2007;177(23):5378‐5393. · Zbl 1141.28010
[25] Karnik NN, Mendel JM, Liang Q. Type‐2 fuzzy logic systems. IEEE Trans Fuzzy Syst. 1999;7(6):643‐658.
[26] Hong J, Tang K, Chen C. Obstacle avoidance of hexapod robots using fuzzy Q‐learning. Proceedings of the 2017 IEEE Symposium Series on Computational Intelligence (SSCI); 2017:1‐6; IEEE.
[27] Glorennec PY. Fuzzy Q‐learning and dynamical fuzzy Q‐learning. Proceedings of 1994 IEEE 3rd International Fuzzy Systems Conference; 1994:474‐479; IEEE.
[28] Vincze D, Kovács S. Incremental Rule Base Creation with Fuzzy Rule Interpolation‐Based Q‐Learning. Springer; 2010:191‐203. · Zbl 1209.68431
[29] Horiuchi T, Fujino A, Katai O, Sawaragi T. Fuzzy interpolation‐based Q‐learning with continuous states and actions. Proceedings of IEEE 5th International Fuzzy Systems; 1996:594‐600; IEEE.
[30] Karnik NN, Mendel JM. Centroid of a type‐2 fuzzy set. Inf Sci. 2001;132(1‐4):195‐220. · Zbl 0982.03030
[31] Wu D, Mendel JM. Enhanced Karnik-Mendel algorithms. IEEE Trans Fuzzy Syst. 2008;17(4):923‐934.
[32] Melgarejo M. A fast recursive method to compute the generalized centroid of an interval type‐2 fuzzy set. Proceedings of the NAFIPS 2007 Annual Meeting of the North American Fuzzy Information Processing Society; 2007:190‐194; IEEE.
[33] Duran K, Bernal H, Melgarejo M. Improved iterative algorithm for computing the generalized centroid of an interval type‐2 fuzzy set. Proceedings of the NAFIPS 2008 Annual Meeting of the North American Fuzzy Information Processing Society; 2008:1‐5; IEEE.
[34] Saxena V, Yadala N, Chourasia R, Rhee FCH. Type reduction techniques for two‐dimensional interval type‐2 fuzzy sets. Proceedings of the 2017 IEEE International Conference on Fuzzy Systems (FUZZ‐IEEE); 2017:1‐6; IEEE.
[35] Watkins CJCH, Dayan P. Q‐learning. Mach Learn. 1992;8(3‐4):279‐292. · Zbl 0773.68062
[36] Sutton RS, Barto AG. Reinforcement Learning: An Introduction. MIT Press; 2018. · Zbl 1407.68009
[37] Amari S. Backpropagation and stochastic gradient descent method. Neurocomputing. 1993;5(4‐5):185‐196. · Zbl 0782.68094
[38] Rajamani R. Vehicle Dynamics and Control. Springer Science & Business Media; 2011.
[39] Dhariwal P, Hesse C, Klimov O, et al. OpenAI baselines; 2017. https://github.com/openai/baselines