A convex analytic approach to Markov decision processes

Vivek S. Borkar

Summary

This paper develops a new framework for the study of Markov decision processes in which the control problem is viewed as an optimization problem on the set of canonically induced measures on the trajectory space of the joint state and control process. This set is shown to be compact convex. One then associates with each of the usual cost criteria (infinite horizon discounted cost, finite horizon, control up to an exit time) a naturally defined occupation measure such that the cost is an integral of some function with respect to this measure. These measures are shown to form a compact convex set whose extreme points are characterized. Classical results about existence of optimal strategies are recovered from this and several applications to multicriteria and constrained optimization problems are briefly indicated.
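As a small illustration of the statement that the cost is an integral with respect to an occupation measure, the following sketch works through the infinite horizon discounted case on a finite example. Everything in it is an assumption made for illustration and is not taken from the paper: the two-state, two-action model, the numbers in P, COST, POLICY and INIT, the discount factor GAMMA, the function names, and the normalization of the occupation measure by (1 - GAMMA) so that it is a probability measure. The sketch computes the measure for one stationary randomized strategy and compares the integral of the cost against it with the discounted cost obtained by ordinary policy evaluation.

```python
# Illustrative finite MDP; all numbers below are hypothetical.
GAMMA = 0.9  # discount factor (assumed)

# P[s][a][t]: probability of moving from state s to state t under control a
P = [
    [[0.8, 0.2], [0.1, 0.9]],
    [[0.5, 0.5], [0.3, 0.7]],
]
# COST[s][a]: one-step cost of using control a in state s
COST = [[1.0, 2.0], [0.5, 3.0]]
# POLICY[s][a]: probability of choosing control a in state s
# (a stationary randomized strategy)
POLICY = [[0.6, 0.4], [0.25, 0.75]]
# INIT[s]: initial state distribution
INIT = [1.0, 0.0]

N_S, N_A = len(P), len(P[0])


def occupation_measure(horizon=2000):
    """Normalized discounted occupation measure on state-control pairs:
    mu(s, a) = (1 - GAMMA) * sum_t GAMMA**t * Prob(X_t = s, Z_t = a),
    with the series truncated after `horizon` terms."""
    mu = [[0.0] * N_A for _ in range(N_S)]
    dist = INIT[:]          # law of the state at the current time
    weight = 1.0 - GAMMA    # (1 - GAMMA) * GAMMA**t
    for _ in range(horizon):
        nxt = [0.0] * N_S
        for s in range(N_S):
            for a in range(N_A):
                joint = dist[s] * POLICY[s][a]
                mu[s][a] += weight * joint
                for t in range(N_S):
                    nxt[t] += joint * P[s][a][t]
        dist = nxt
        weight *= GAMMA
    return mu


def integrate_cost(mu):
    """Discounted cost written as an integral of COST against mu."""
    total = sum(mu[s][a] * COST[s][a] for s in range(N_S) for a in range(N_A))
    return total / (1.0 - GAMMA)


def discounted_cost_direct(horizon=2000):
    """Discounted cost by iterating V <- c_pi + GAMMA * P_pi V from V = 0."""
    v = [0.0] * N_S
    for _ in range(horizon):
        v = [
            sum(
                POLICY[s][a]
                * (COST[s][a] + GAMMA * sum(P[s][a][t] * v[t] for t in range(N_S)))
                for a in range(N_A)
            )
            for s in range(N_S)
        ]
    return sum(INIT[s] * v[s] for s in range(N_S))


def balance_residual(mu):
    """Largest violation of the linear constraint
    sum_a mu(t, a) = (1 - GAMMA) * INIT[t] + GAMMA * sum_{s,a} mu(s, a) * P[s][a][t]."""
    worst = 0.0
    for t in range(N_S):
        lhs = sum(mu[t][a] for a in range(N_A))
        inflow = sum(mu[s][a] * P[s][a][t] for s in range(N_S) for a in range(N_A))
        rhs = (1.0 - GAMMA) * INIT[t] + GAMMA * inflow
        worst = max(worst, abs(lhs - rhs))
    return worst


mu = occupation_measure()
print("cost as integral :", integrate_cost(mu))
print("cost by iteration:", discounted_cost_direct())
print("balance residual :", balance_residual(mu))
```

In this finite setting the two cost values agree up to rounding, and the balance residual is numerically zero. Because that balance constraint is linear in mu and the cost is a linear functional of mu, minimizing the cost over such measures is a linear program in this toy case, which is one elementary way to see why convexity and extreme points are natural objects here. The paper's own setting and arguments are more general than this sketch.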

Author information

Vivek S. Borkar
Present address: Bangalore Center, Tata Institute of Fundamental Research, I.I.Sc. Campus, P.O. Box 1234, 560012, Bangalore, India

Authors and Affiliations

Systems Research Center, University of Maryland, 20742, College Park, MD, USA
Vivek S. Borkar

Additional information

Research supported by NSF Grant CDR-85-00108

About this article

Cite this article

Borkar, V.S. A convex analytic approach to Markov decision processes. Probab. Th. Rel. Fields 78, 583–602 (1988). https://doi.org/10.1007/BF00353877

Received: 15 March 1987
Revised: 08 January 1988
Issue Date: August 1988
DOI: https://doi.org/10.1007/BF00353877
