Abstract
This paper presents a framework for mapping value-iteration and related successive approximation methods for Markov Decision Problems onto Hopfield neural networks, covering both the discounted and the undiscounted case with finite state and action spaces. We analyse the asymptotic behaviour of the control sets and give estimates of the convergence rate of the value-iteration scheme. We relate the convergence properties to an energy function, which is the key element in mapping Markov Decision Problems onto Hopfield networks. Finally, we consider an application to queueing systems in communication networks and present results of computer simulations of a Hopfield network running the equivalent Markov Decision Problem, together with comments on possible developments.
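For readers unfamiliar with the underlying scheme, the following is a minimal sketch of value iteration for a discounted finite Markov Decision Problem. The transition matrices, rewards, and discount factor below are hypothetical toy data chosen for illustration; they do not come from the paper, and the Hopfield-network mapping itself is not shown here.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Successive approximation of the optimal value function.

    P: list of transition matrices, P[a][s, s'] = Pr(s' | s, a)
    R: list of reward vectors, R[a][s] = expected one-step reward
    gamma: discount factor in (0, 1)
    Returns (V, policy) where policy[s] is a greedy optimal action.
    """
    n = P[0].shape[0]
    V = np.zeros(n)
    for _ in range(max_iter):
        # Bellman optimality update:
        # V(s) <- max_a [ R(s,a) + gamma * sum_{s'} P(s'|s,a) V(s') ]
        Q = np.array([R[a] + gamma * (P[a] @ V) for a in range(len(P))])
        V_new = Q.max(axis=0)
        # Sup-norm stopping test; the discounted operator is a
        # gamma-contraction, so this converges geometrically.
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)
        V = V_new
    return V, Q.argmax(axis=0)

# Two-state, two-action toy problem (illustrative data only).
P = [np.array([[0.8, 0.2], [0.3, 0.7]]),
     np.array([[0.1, 0.9], [0.6, 0.4]])]
R = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]
V, policy = value_iteration(P, R)
```

The geometric convergence rate implied by the contraction property is the quantity whose analogue the paper estimates for the Hopfield-network formulation.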
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
Cite this paper
Murgu, A. (1995). Mapping discounted and undiscounted Markov Decision Problems onto Hopfield neural networks. In: Andersson, S.I. (eds) Analysis of Dynamical and Cognitive Systems. Lecture Notes in Computer Science, vol 888. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58843-4_17
Print ISBN: 978-3-540-58843-6
Online ISBN: 978-3-540-49113-2