Dual iterative adaptive dynamic programming for a class of discrete-time nonlinear systems with time-delays


Abstract

In this paper, a new dual iterative adaptive dynamic programming (ADP) algorithm is developed to solve optimal control problems for a class of nonlinear systems with time-delays in the state and control variables. The idea is to use dynamic programming theory to derive expressions for the optimal performance index function and the optimal control. The dual iterative ADP algorithm is then introduced to obtain the optimal solutions iteratively; in each iteration, both the performance index function and the system states are updated. A convergence analysis is presented to prove that the performance index function reaches the optimum under the proposed method. To facilitate implementation of the dual iterative ADP algorithm, neural networks are used to approximate the performance index function and to compute the optimal control policy, respectively. Simulation examples are given to demonstrate the validity of the proposed optimal control scheme.
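As a rough illustration of iterating a performance index function toward its optimum, the sketch below runs plain value iteration on a toy problem. It is not the dual iterative ADP algorithm of the paper, whose equations are not given on this page. Everything in it is an assumption made for illustration: a scalar linear system with a single one-step state delay, x(k+1) = a x(k) + a_d x(k-1) + b u(k), a quadratic cost q x(k)^2 + r u(k)^2, the numeric values, and the helper names `matmul`, `transpose`, and `value_iteration`. The delay is handled by stacking x(k) and x(k-1) into one augmented state, and no neural networks are used because the quadratic value function can be stored exactly as a 2x2 matrix.

```python
# Toy sketch only: NOT the paper's dual iterative ADP algorithm.
# Assumed system: x(k+1) = a*x(k) + a_d*x(k-1) + b*u(k)
# Assumed cost:   sum over k of q*x(k)^2 + r*u(k)^2


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


def value_iteration(a=0.9, a_d=0.3, b=1.0, q=1.0, r=1.0,
                    tol=1e-10, max_iter=1000):
    # Augmented state z(k) = [x(k), x(k-1)] absorbs the one-step state delay.
    A = [[a, a_d], [1.0, 0.0]]
    B = [[b], [0.0]]
    Q = [[q, 0.0], [0.0, 0.0]]
    At = transpose(A)

    P = [[0.0, 0.0], [0.0, 0.0]]  # V_0(z) = z' P z = 0
    history = [P]
    for _ in range(max_iter):
        PB = matmul(P, B)                       # 2x1
        s = r + matmul(transpose(B), PB)[0][0]  # scalar r + B' P B
        AtPA = matmul(At, matmul(P, A))         # 2x2
        AtPB = matmul(At, PB)                   # 2x1
        # V_{i+1}(z) = min_u { z'Qz + r*u^2 + V_i(Az + Bu) } in matrix form.
        P_next = [[Q[i][j] + AtPA[i][j] - AtPB[i][0] * AtPB[j][0] / s
                   for j in range(2)] for i in range(2)]
        change = max(abs(P_next[i][j] - P[i][j])
                     for i in range(2) for j in range(2))
        P = P_next
        history.append(P)
        if change < tol:
            break

    # Greedy feedback u(k) = -(K[0]*x(k) + K[1]*x(k-1)) for the final P.
    PB = matmul(P, B)
    s = r + matmul(transpose(B), PB)[0][0]
    AtPB = matmul(At, PB)
    K = [AtPB[0][0] / s, AtPB[1][0] / s]
    return P, K, history


if __name__ == "__main__":
    P, K, history = value_iteration()
    print("iterations:", len(history) - 1)
    print("P:", P)
    print("K:", K)
```

Starting from a zero value function, each pass replaces the current estimate with the one-step-lookahead minimum, so the entries of `history` form a nondecreasing sequence in this linear-quadratic setting. The nonlinear, control-delay case treated in the paper requires function approximation and the authors' own update rules, which this sketch does not attempt to reproduce.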



Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grants 60904037, 60921061, and 61034002, in part by Beijing Natural Science Foundation under Grant 4102061, and in part by China Postdoctoral Science Foundation under Grant 201104162.

Author information

Authors and Affiliations

State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, People’s Republic of China
Qinglai Wei, Ding Wang & Dehua Zhang


Corresponding author

Correspondence to Qinglai Wei.


About this article

Cite this article

Wei, Q., Wang, D. & Zhang, D. Dual iterative adaptive dynamic programming for a class of discrete-time nonlinear systems with time-delays. Neural Comput & Applic 23, 1851–1863 (2013). https://doi.org/10.1007/s00521-012-1188-7


Received: 30 April 2012
Accepted: 13 September 2012
Published: 17 October 2012
Issue Date: December 2013
DOI: https://doi.org/10.1007/s00521-012-1188-7
