
Continuous dependence of stochastic control models on the noise distribution. (English) Zbl 0639.93068

A discrete-time stochastic control system is considered, in which the state process depends on the control actions and on a noise process given by a sequence of independent, identically distributed random elements. Given the initial state and the planning horizon, sufficient conditions are proved under which the optimal reward function depends continuously on the common noise distribution, for several reward criteria. The research is motivated by problems of adaptive control of stochastic systems with unknown noise distribution, which are discussed briefly in the paper.
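The setting described above can be sketched in standard Markov control notation; the symbols below are illustrative and not taken from the paper itself:

```latex
\begin{align*}
& x_{t+1} = F(x_t, a_t, \xi_t), \qquad t = 0, 1, \dots, \\
& \text{where } (\xi_t) \text{ is i.i.d.\ with common distribution } \mu, \\[4pt]
& V_\mu(x) \;=\; \sup_{\pi} \, \mathbb{E}_x^{\pi}
      \Bigl[\, \textstyle\sum_{t=0}^{N} \alpha^{t}\, r(x_t, a_t) \Bigr],
      \qquad 0 < \alpha \le 1,
\end{align*}
```

with $V_\mu$ the optimal reward from initial state $x$ over the planning horizon $N$. A continuity result of the type proved in the paper then reads: if $\mu_n \Rightarrow \mu$ (in a suitable topology on distributions), then $V_{\mu_n}(x) \to V_\mu(x)$. Such results justify adaptive schemes in which an estimate $\hat{\mu}_n$ of the unknown noise distribution is plugged into the control problem.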
Reviewer: Sv.Gaidov

MSC:

93E20 Optimal stochastic control
60H99 Stochastic analysis
93C55 Discrete-time control/observation systems
93C40 Adaptive control/observation systems

References:

[1] Berge C (1963) Topological Spaces. Macmillan, New York · Zbl 0114.13902
[2] Bertsekas DP, Shreve SE (1978) Stochastic Optimal Control. Academic Press, New York
[3] Cavazos-Cadena R (to appear) Necessary conditions for the optimality of average-reward Markov decision processes
[4] Dynkin EB, Yushkevich AA (1979) Controlled Markov Processes. Springer-Verlag, Berlin
[5] Georgin JP (1978) Contrôle des chaînes de Markov sur des espaces arbitraires. Ann Inst H Poincaré 16:255-277 · Zbl 0391.60066
[6] Georgin JP (1978) Estimation et contrôle des chaînes de Markov sur des espaces arbitraires. Lecture Notes in Mathematics, Vol 636. Springer-Verlag, Berlin, pp 71-113 · Zbl 0372.60094
[7] Ghosal A (1977) Isomorphic queues. Bull Austral Math Soc 17:275-289 · Zbl 0363.60105 · doi:10.1017/S0004972700010479
[8] Gordienko EI (1985) Adaptive strategies for certain classes of controlled Markov processes. Theory Probab Appl 29:504-518 · Zbl 0577.93067 · doi:10.1137/1129064
[9] Gubenko LG, Statland ES (1975) On controlled, discrete-time Markov decision processes. Theory Probab Math Statist 7:47-61
[10] Hernández-Lerma O (1985) Approximation and adaptive policies in discounted dynamic programming. Bol Soc Mat Mexicana 30:25-35 · Zbl 0641.90087
[11] Hernández-Lerma O, Marcus SI (1984) Identification and approximation of queueing systems. IEEE Trans Automat Control 29:472-474 · Zbl 0544.93070 · doi:10.1109/TAC.1984.1103564
[12] Hernández-Lerma O, Marcus SI (to appear) Adaptive policies for discrete-time stochastic control systems with unknown disturbance distribution · Zbl 0637.93075
[13] Himmelberg CJ, Parthasarathy T, Van Vleck FS (1976) Optimal plans for dynamic programming problems. Math Oper Res 1:390-394 · Zbl 0368.90134 · doi:10.1287/moor.1.4.390
[14] Hinderer K (1970) Foundations of Non-stationary Dynamic Programming with Discrete Time Parameter. Springer-Verlag, Berlin · Zbl 0202.18401
[15] Kolonko M (1982) The average-optimal adaptive control of a Markov renewal model in presence of an unknown parameter. Math Operationsforsch Statist Ser Optim 13:567-591 · Zbl 0518.90092
[16] Kolonko M (1983) Bounds for the regret loss in dynamic programming under adaptive control. Z Oper Res 27:13-37 · Zbl 0502.90085 · doi:10.1007/BF01916897
[17] Kurano M (1985) Average-optimal adaptive policies in semi-Markov decision processes including an unknown parameter. J Oper Res Soc Japan 28:252-266 · Zbl 0579.90098
[18] Royden HL (1986) Real Analysis. Macmillan, New York · Zbl 0588.00021
[19] Tijms HC (1975) On dynamic programming with arbitrary state space, compact action space and the average reward as criterion. Report BW 55/75, Mathematisch Centrum, Amsterdam
[20] Ueno T (1957) Some limit theorems for temporally discrete Markov processes. J Fac Sci Univ Tokyo 7:449-462 · Zbl 0077.33201