
Stochastic games with unbounded payoffs: applications to robust control in economics. (English) Zbl 1263.91008

Summary: We study a discounted maxmin control problem with general state space. The controller is unsure about his model in the sense that he also considers a class of approximate models as possibly true. The objective is to choose a maxmin strategy that will work under a range of different model specifications. This is done by dynamic programming techniques. Under relatively weak conditions, we show that there is a solution to the optimality equation for the maxmin control problem as well as an optimal strategy for the controller. These results are applied to the theory of optimal growth and the Hansen-Sargent robust control model in macroeconomics. We also study a class of zero-sum discounted stochastic games with unbounded payoffs and simultaneous moves, and give a brief overview of recent results on stochastic games with weakly continuous transitions under the limiting average payoff criterion.
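The dynamic programming approach described above can be illustrated in the simplest possible setting. The sketch below assumes finite states and actions and a finite set of candidate transition models; all names and the specific iteration scheme are illustrative, not taken from the paper, which works with general state spaces and unbounded payoffs.

```python
import numpy as np

def robust_value_iteration(reward, kernels, beta=0.9, tol=1e-10, max_iter=10_000):
    """Maxmin (robust) value iteration, a finite-state sketch.

    reward  : (S, A) array, one-period payoff for each state-action pair
    kernels : list of (S, A, S) arrays, the candidate transition models
    beta    : discount factor in (0, 1)
    """
    n_states, _ = reward.shape
    v = np.zeros(n_states)
    for _ in range(max_iter):
        # For each candidate model, the usual Bellman backup ...
        q = np.stack([reward + beta * P @ v for P in kernels])  # (K, S, A)
        # ... then the worst model (min over K) and the best action (max over A),
        # i.e. the maxmin optimality equation in this finite setting.
        v_new = q.min(axis=0).max(axis=1)
        if np.max(np.abs(v_new - v)) < tol:
            v = v_new
            break
        v = v_new
    policy = q.min(axis=0).argmax(axis=1)
    return v, policy
```

Because the maxmin Bellman operator is a beta-contraction in the sup norm here, the iteration converges geometrically regardless of the set of models; the paper's contribution is establishing analogous results without boundedness of payoffs.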

MSC:

91A15 Stochastic games, stochastic differential games
93D09 Robust stability
91B62 Economic growth models
91B64 Macroeconomic theory (monetary models, models of taxation)
90C47 Minimax problems in mathematical programming
