Optimal cost and policy for a Markovian replacement problem. (English) Zbl 0793.90026

Summary: We consider the computation of the optimal cost and policy associated with a two-dimensional Markov replacement problem with partial observations, for two special cases of observation quality. Relying on structural results available for the optimal policy associated with these two particular models, we show that, in both cases, the infinite-horizon, optimal discounted cost function is piecewise linear, and provide formulas for computing the cost and the policy. Several examples illustrate the usefulness of the results.
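The flavor of the computation can be sketched with belief-state value iteration for a simple two-state replacement model. The sketch below assumes the unobservable-state case (one extreme of observation quality, as in Wang's model); all numerical parameters are illustrative and not taken from the paper. The belief x is the probability that the machine has failed, and the two actions are "keep" and "replace".

```python
import numpy as np

# Hypothetical two-state replacement model (illustrative parameters,
# not taken from the paper). State 0 = operating, state 1 = failed;
# the belief x is the probability of being in the failed state.
p = 0.2          # one-step probability of failure
c_fail = 2.0     # per-period cost incurred in the failed state
c_rep = 5.0      # cost of replacing the machine (resets it to new)
beta = 0.9       # discount factor

grid = np.linspace(0.0, 1.0, 1001)   # discretized belief space
V = np.zeros_like(grid)

for _ in range(500):                 # value iteration; the map is a contraction
    # "Keep": pay the expected running cost; with no observations the
    # belief simply drifts toward failure: x' = x + (1 - x) * p.
    x_keep = grid + (1.0 - grid) * p
    keep = grid * c_fail + beta * np.interp(x_keep, grid, V)
    # "Replace": pay c_rep, restart from a new machine (belief 0),
    # which degrades once before the next decision epoch: x' = p.
    replace = c_rep + beta * np.interp(p, grid, V)
    V = np.minimum(keep, replace)

# With "replace" cost independent of the belief and "keep" cost
# nondecreasing in it, the optimal policy is a threshold rule.
threshold = grid[np.argmax(replace <= keep)]
```

On this grid the approximate value function is concave and piecewise linear up to discretization error, and the optimal policy reduces to a single replacement threshold in the belief, which is the kind of structure the summary refers to.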

MSC:

90B25 Reliability, availability, maintenance, inspection in operations research
90C39 Dynamic programming
60K20 Applications of Markov renewal processes (reliability, queueing networks, etc.)
60J10 Markov chains (discrete-time Markov processes on discrete state spaces)
References:

[1] Åström, K. J., Optimal Control of Markov Processes with Incomplete State Information, Journal of Mathematical Analysis and Applications, Vol. 10, pp. 174-205, 1965. · Zbl 0137.35803 · doi:10.1016/0022-247X(65)90154-X
[2] White, C. C., Bounds on the Optimal Cost for a Replacement Problem with Partial Observations, Naval Research Logistics Quarterly, Vol. 26, pp. 415-422, 1979. · Zbl 0496.90038 · doi:10.1002/nav.3800260305
[3] Bertsekas, D. P., Dynamic Programming, Prentice-Hall, Englewood Cliffs, New Jersey, 1987. · Zbl 0649.93001
[4] Albright, S. C., Structural Results for Partially Observable Markov Decision Processes, Operations Research, Vol. 27, pp. 1041-1053, 1979. · Zbl 0433.90083 · doi:10.1287/opre.27.5.1041
[5] Ross, S. M., Quality Control under Markovian Deterioration, Management Science, Vol. 17, pp. 587-596, 1971. · Zbl 0221.90021 · doi:10.1287/mnsc.17.9.587
[6] Sawaki, K., Transformation of Partially Observable Markov Decision Processes into Piecewise Linear Ones, Journal of Mathematical Analysis and Applications, Vol. 91, pp. 112-118, 1983. · Zbl 0521.90101 · doi:10.1016/0022-247X(83)90096-3
[7] Sondik, E. J., The Optimal Control of Partially Observable Markov Processes, PhD Thesis, Department of Electrical Engineering Systems, Stanford University, 1971. · Zbl 0379.60067
[8] Sondik, E. J., The Optimal Control of Partially Observable Markov Decision Processes over the Infinite Horizon: Discounted Costs, Operations Research, Vol. 26, pp. 282-304, 1978. · Zbl 0379.60067 · doi:10.1287/opre.26.2.282
[9] Wang, R. C., Computing Optimal Control Policies: Two Actions, Journal of Applied Probability, Vol. 13, pp. 826-832, 1976. · Zbl 0374.90080 · doi:10.2307/3212542
[10] Wang, R. C., Optimal Replacement Policy with Unobservable States, Journal of Applied Probability, Vol. 14, pp. 340-348, 1977. · Zbl 0386.60062 · doi:10.2307/3213004
[11] White, C. C., A Markov Quality Control Process Subject to Partial Observation, Management Science, Vol. 23, pp. 843-852, 1977. · Zbl 0353.93060 · doi:10.1287/mnsc.23.8.843
[12] White, C. C., Optimal Inspection and Repair of a Production Process Subject to Deterioration, Journal of the Operational Research Society, Vol. 29, pp. 235-243, 1978. · Zbl 0377.90050
[13] Lovejoy, W. S., Computationally Feasible Bounds for Partially Observed Markov Decision Processes, Research Paper No. 1024, Graduate School of Business, Stanford University, 1988.
[14] White, C. C., and Scherer, W. T., Solution Procedures for Partially Observed Markov Decision Processes, Operations Research, Vol. 37, pp. 791-797, 1989. · Zbl 0684.90099 · doi:10.1287/opre.37.5.791
[15] Hernandez-Lerma, O., and Marcus, S. I., Adaptive Control of Discrete Discounted Markov Decision Chains, Journal of Optimization Theory and Applications, Vol. 46, pp. 227-235, 1985. · Zbl 0543.90093 · doi:10.1007/BF00938426
[16] Hernandez-Lerma, O., and Marcus, S. I., Adaptive Control of Markov Processes with Incomplete State Information and Unknown Parameters, Journal of Optimization Theory and Applications, Vol. 52, pp. 227-241, 1987. · Zbl 0585.90090 · doi:10.1007/BF00941283
[17] Monahan, G. E., Optimal Stopping in a Partially Observable Binary-Valued Markov Chain with Costly Perfect Information, Journal of Applied Probability, Vol. 19, pp. 72-81, 1982. · Zbl 0482.60043 · doi:10.2307/3213917
[18] Kumar, P. R., and Seidman, T. I., On the Optimal Solution of the One-Armed Bandit Adaptive Control Problem, IEEE Transactions on Automatic Control, Vol. 26, pp. 1176-1184, 1981. · Zbl 0475.90087 · doi:10.1109/TAC.1981.1102790
[19] Thomas, L. C., Jacobs, P. A., and Gaver, D. P., Optimal Inspection Policies for Standby Systems, Communications in Statistics: Stochastic Models, Vol. 3, pp. 259-273, 1987. · Zbl 0634.60079 · doi:10.1080/15326348708807056
[20] Pollock, S. M., Minimum-Cost Checking Using Imperfect Information, Management Science, Vol. 13, pp. 454-465, 1967. · doi:10.1287/mnsc.13.7.454
[21] Hughes, J. S., Optimal Internal Audit Timing, Accounting Review, Vol. 52, pp. 56-68, 1977.
[22] Hughes, J. S., A Note on Quality Control under Markovian Deterioration, Operations Research, Vol. 28, pp. 421-424, 1980. · Zbl 0426.90038 · doi:10.1287/opre.28.2.421
[23] Bertsekas, D. P., and Shreve, S. E., Stochastic Optimal Control: The Discrete Time Case, Academic Press, New York, New York, 1978. · Zbl 0471.93002
[24] Fernandez-Gaucherand, E., Arapostathis, A., and Marcus, S. I., On the Adaptive Control of a Partially Observable Markov Decision Process, Proceedings of the 27th Conference on Decision and Control, Austin, Texas, pp. 1204-1210, 1988.
[25] Monahan, G. E., A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms, Management Science, Vol. 28, pp. 1-16, 1982. · Zbl 0486.90084 · doi:10.1287/mnsc.28.1.1
[26] White, C. C., and White, D. J., Markov Decision Processes, European Journal of Operational Research, Vol. 39, pp. 1-16, 1989. · Zbl 0677.90086 · doi:10.1016/0377-2217(89)90348-2
[27] Sawaki, K., and Ichikawa, A., Optimal Control for Partially Observable Markov Decision Processes over an Infinite Horizon, Journal of the Operations Research Society of Japan, Vol. 21, pp. 1-15, 1978. · Zbl 0386.49008
[28] Martin, J. J., Bayesian Decision Problems and Markov Chains, John Wiley and Sons, New York, New York, 1967. · Zbl 0164.50102
[29] Sernik, E. L., and Marcus, S. I., Comments on the Sensitivity of the Optimal Cost and the Optimal Policy for a Discrete Markov Decision Process, Proceedings of the 27th Annual Allerton Conference on Communication, Control, and Computing, Monticello, Illinois, pp. 935-944, 1989.
[30] Sernik, E. L., and Marcus, S. I., On the Computation of the Optimal Cost Function for Discrete Time Markov Models with Partial Observations, Annals of Operations Research, Vol. 29, pp. 471-512, 1991. · Zbl 0717.90095 · doi:10.1007/BF02283611
[31] Federgruen, A., and Schweitzer, P. J., Discounted and Undiscounted Value Iteration in Markov Decision Problems: A Survey, Dynamic Programming and Its Applications, Edited by M. Puterman, Academic Press, New York, New York, pp. 23-52, 1979.
[32] Fernandez-Gaucherand, E., Arapostathis, A., and Marcus, S. I., On Partially Observable Markov Decision Processes with an Average Cost Criterion, Proceedings of the 28th Conference on Decision and Control, Tampa, Florida, pp. 1267-1272, 1989.
[33] Andriyanov, V. A., Kogan, I. A., and Umnov, G. A., Optimal Control of a Partially Observable Discrete Markov Process, Automation and Remote Control, Vol. 4, pp. 555-561, 1980. · Zbl 0469.90085