Robust min-max optimal control design for systems with uncertain models: a neural dynamic programming approach. (English) Zbl 1443.49031

Summary: The main outcome of this study is the design of an artificial neural network (ANN) based sub-optimal controller that solves a finite-horizon optimization problem for a class of uncertain systems. The optimization problem considers a convex performance index in the Bolza form. The dynamic constraint is a linear system affected by modeling uncertainties and by bounded external perturbations. The proposed controller implements a min-max approach based on an approximate solution obtained by neural dynamic programming. An ANN approximates the Value function to estimate the solution of the Hamilton-Jacobi-Bellman (HJB) equation. The explicit adaptive law for the ANN weights is obtained from this approximation of the HJB solution. A stability analysis based on Lyapunov theory confirms that the approximate Value function serves as a Lyapunov function candidate and establishes the practical stability of the equilibrium point. A simulation example illustrates the characteristics of the sub-optimal controller. A comparison of the performance indexes obtained with different controllers evaluates the effect of the perturbations and of the sub-optimal solution.
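
For orientation, the problem class described in the summary can be sketched in generic notation. All symbols below (\(A\), \(\Delta A\), \(B\), \(\xi\), \(\xi^{+}\), \(h\), \(L\), \(\hat W\), \(\sigma\)) are assumptions introduced here for illustration; they are not taken from the paper, whose exact formulation may differ.

\[ \dot x(t) = \big(A + \Delta A(t)\big)\,x(t) + B\,u(t) + \xi(t), \qquad \|\xi(t)\| \le \xi^{+}, \]
\[ J(u) = h\big(x(T)\big) + \int_0^T L\big(x(t),u(t)\big)\,dt, \]
\[ -\frac{\partial V}{\partial t}(t,x) = \min_{u}\,\max_{\Delta A,\,\xi}\Big[ L(x,u) + \nabla_x V(t,x)^{\top}\big((A+\Delta A)\,x + B\,u + \xi\big)\Big], \qquad V(T,x) = h(x), \]
\[ \hat V(t,x) = \hat W(t)^{\top}\sigma(x). \]

Under these assumptions, the first line is a linear system with a modeling uncertainty \(\Delta A\) and a bounded perturbation \(\xi\); the second is a Bolza-form index with terminal cost \(h\) and convex running cost \(L\); the third is a min-max HJB-type equation for the Value function \(V\); and the fourth is one possible single-layer ANN parametrization of \(V\), with time-varying weights \(\hat W\) and fixed activation functions \(\sigma\).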

MSC:

49K35 Optimality conditions for minimax problems
35F21 Hamilton-Jacobi equations
90C39 Dynamic programming