Constrained continuous-time Markov decision processes on the finite horizon. (English) Zbl 1370.90285

Summary: This paper studies constrained (nonhomogeneous) continuous-time Markov decision processes on the finite horizon. The performance criterion to be optimized is the expected total reward on the finite horizon, while \(N\) constraints are imposed on similar expected costs. Introducing an appropriate notion of occupation measures for this optimal control problem, we establish the following under suitable conditions: (a) the class of Markov policies is sufficient; (b) every extreme point of the space of performance vectors is generated by a deterministic Markov policy; and (c) there exists an optimal Markov policy that is a mixture of no more than \(N+1\) deterministic Markov policies.
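
As a reading aid, the problem described in the summary can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the paper: \(V_0(\pi)\) denotes the expected total reward of a policy \(\pi\) on the finite horizon, \(V_j(\pi)\) the \(j\)-th expected total cost, and \(d_j\) the corresponding constraint constant; the direction of the inequalities is likewise assumed.

\[
\text{maximize } V_0(\pi) \quad \text{subject to} \quad V_j(\pi) \le d_j, \qquad j = 1, \dots, N.
\]

Under this reading, statement (c) says that the performance vector of some optimal Markov policy \(\pi^*\) can be written as

\[
\bigl(V_0(\pi^*), \dots, V_N(\pi^*)\bigr) = \sum_{k=1}^{N+1} \lambda_k \bigl(V_0(\varphi_k), \dots, V_N(\varphi_k)\bigr), \qquad \lambda_k \ge 0, \quad \sum_{k=1}^{N+1} \lambda_k = 1,
\]

where \(\varphi_1, \dots, \varphi_{N+1}\) are deterministic Markov policies. This is consistent with statement (b): the performance vectors live in \(\mathbb{R}^{N+1}\), and an optimal point can be sought on the boundary of the convex set of performance vectors, where a Carathéodory-type argument needs at most \(N+1\) extreme points.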

MSC:

90C40 Markov and semi-Markov decision processes
60J27 Continuous-time Markov processes on discrete state spaces

References:

[1] Altman, E.: Constrained Markov Decision Processes. Chapman & Hall, Boca Raton (1999) · Zbl 0963.90068
[2] Avrachenkov, K., Habachi, O., Piunovskiy, A., Zhang, Y.: Infinite horizon impulsive optimal control with applications to Internet congestion control. Int. J. Control 88, 703-716 (2015) · Zbl 1319.49053 · doi:10.1080/00207179.2014.971436
[3] Bäuerle, N., Rieder, U.: Markov Decision Processes with Applications to Finance. Springer, Heidelberg (2011) · Zbl 1236.90004 · doi:10.1007/978-3-642-18324-9
[4] Bertsekas, D., Nedić, A., Ozdaglar, A.: Convex Analysis and Optimization. Athena Scientific, Belmont (2003) · Zbl 1140.90001
[5] Feinberg, E.: Continuous time discounted jump Markov decision processes: a discrete-event approach. Math. Oper. Res. 29, 492-524 (2004) · Zbl 1082.90126 · doi:10.1287/moor.1040.0089
[6] Feinberg, E., Mandava, M., Shiryaev, A.: On solutions of Kolmogorov's equations for nonhomogeneous jump Markov processes. J. Math. Anal. Appl. 411(1), 261-270 (2014) · Zbl 1328.60192 · doi:10.1016/j.jmaa.2013.09.043
[7] Feinberg, E., Rothblum, U.: Splitting randomized stationary policies in total-reward Markov decision processes. Math. Oper. Res. 37, 129-153 (2012) · Zbl 1243.90233 · doi:10.1287/moor.1110.0525
[8] Ghosh, M.K., Saha, S.: Continuous-time controlled jump Markov processes on the finite horizon. In: Optimization, Control, and Applications of Stochastic Systems, pp. 99-109. Birkhäuser, New York (2012) · Zbl 1374.90403
[9] Guo, X.P., Hernández-Lerma, O.: Continuous-Time Markov Decision Processes. Springer, Berlin (2009) · Zbl 1209.90002 · doi:10.1007/978-3-642-02547-1
[10] Guo, X.P., Hernández-Lerma, O.: Constrained continuous-time Markov controlled processes with discounted criteria. Stoch. Anal. Appl. 21, 379-399 (2003) · Zbl 1099.90071 · doi:10.1081/SAP-120019291
[11] Guo, X.P., Huang, X.X., Huang, Y.H.: Finite horizon optimality for continuous-time Markov decision processes with unbounded transition rates. Adv. Appl. Probab. 47, 1-24 (2015) · Zbl 1330.90125 · doi:10.1017/S0001867800049016
[12] Guo, X.P., Huang, Y.H., Song, X.Y.: Linear programming and constrained average optimality for general continuous-time Markov decision processes in history-dependent policies. SIAM J. Control Optim. 50, 23-47 (2012) · Zbl 1250.90108 · doi:10.1137/100805169
[13] Guo, X.P., Song, X.Y.: Discounted continuous-time constrained Markov decision processes in Polish spaces. Ann. Appl. Probab. 21, 2016-2049 (2011) · Zbl 1258.90104 · doi:10.1214/10-AAP749
[14] Guo, X.P., Piunovskiy, A.: Discounted continuous-time Markov decision processes with constraints: unbounded transition and loss rates. Math. Oper. Res. 36, 105-132 (2011) · Zbl 1218.90209 · doi:10.1287/moor.1100.0477
[15] Guo, X.P., Vykertas, M., Zhang, Y.: Absorbing continuous-time Markov decision processes with total cost criteria. Adv. Appl. Probab. 45, 490-519 (2013) · Zbl 1282.90229 · doi:10.1017/S0001867800006418
[16] Hernández-Lerma, O., Lasserre, J.B.: Discrete-Time Markov Control Processes: Basic Optimality Criteria. Springer, New York (1996) · Zbl 0840.93001 · doi:10.1007/978-1-4612-0729-0
[17] Hernández-Lerma, O., Lasserre, J.B.: Further Topics on Discrete-Time Markov Control Processes. Springer, New York (1999) · Zbl 0928.93002 · doi:10.1007/978-1-4612-0561-6
[18] Huang, Y.H.: Finite horizon continuous-time Markov decision processes with mean and variance criteria. Submitted (2015)
[19] Jacod, J.: Multivariate point processes: predictable projection, Radon-Nikodym derivatives, representation of martingales. Z. Wahrscheinlichkeitstheorie und verwandte Gebiete 31, 235-253 (1975) · Zbl 0302.60032 · doi:10.1007/BF00536010
[20] Kitaev, M.Y., Rykov, V.V.: Controlled Queueing Systems. CRC Press, New York (1995) · Zbl 0876.60077
[21] Miller, B.L.: Finite state continuous time Markov decision processes with a finite planning horizon. SIAM J. Control 6, 266-280 (1968) · Zbl 0162.23302 · doi:10.1137/0306020
[22] Miller, B., Miller, G., Siemenikhin, K.: Towards the optimal control of Markov chains with constraints. Automatica 46, 1495-1502 (2010) · Zbl 1201.93131 · doi:10.1016/j.automatica.2010.06.003
[23] Piunovskiy, A.B.: Optimal Control of Random Sequences in Problems with Constraints. Kluwer Academic, Dordrecht (1997) · Zbl 0894.93001 · doi:10.1007/978-94-011-5508-3
[24] Piunovskiy, A.: A controlled jump discounted model with constraints. Theory Probab. Appl. 42, 51-71 (1998) · doi:10.1137/S0040585X97975964
[25] Piunovskiy, A., Zhang, Y.: Discounted continuous-time Markov decision processes with unbounded rates: the convex analytic approach. SIAM J. Control Optim. 49, 2032-2061 (2011) · Zbl 1242.90283 · doi:10.1137/10081366X
[26] Pliska, S.R.: Controlled jump processes. Stoch. Process. Appl. 3, 259-282 (1975) · Zbl 0313.60055 · doi:10.1016/0304-4149(75)90025-3
[27] Prieto-Rumeau, T., Hernández-Lerma, O.: Selected Topics in Continuous-Time Controlled Markov Chains and Markov Games. Imperial College Press, London (2012) · Zbl 1269.60004 · doi:10.1142/p829
[28] Yushkevich, A.A.: Controlled Markov models with countable state and continuous time. Theory Probab. Appl. 22, 215-235 (1977) · Zbl 0379.93052 · doi:10.1137/1122029
[29] Zhang, L.L., Guo, X.P.: Constrained continuous-time Markov decision processes with average criteria. Math. Methods Oper. Res. 67, 323-340 (2008) · Zbl 1143.90033 · doi:10.1007/s00186-007-0154-0
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.