Finite-state approximations and adaptive control of discounted Markov decision processes with unbounded rewards. (English) Zbl 0678.93065
The author considers denumerable-state, discounted Markov decision processes with unbounded rewards that depend on unknown parameters. In particular, he addresses the problem of determining
- a finite-state iterative method to find the optimal total expected discounted reward corresponding to the true parameter value,
- adaptive policies with asymptotic optimality properties.
To this end, he follows an earlier paper on a similar topic for the finite-state, bounded-rewards case.
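As an illustration only, the general idea of a finite-state iterative scheme can be sketched as value iteration on a truncated state space. This is not the method of the paper under review: the toy model (states 0, 1, 2, ..., reward -s for continuing, a fixed cost of 3 for resetting to state 0), the parameter name `theta`, and the rule of redirecting transitions that leave the truncated set to its last state are all assumptions made for this example.

```python
def truncated_value_iteration(n_states, actions, transition, reward,
                              beta, tol=1e-10, max_iter=100_000):
    """Value iteration restricted to states 0..n_states-1.

    transition(s, a) returns (next_state, probability) pairs; any next
    state outside the truncated set is mapped to the last state
    (an assumption of this sketch, not taken from the reviewed paper).
    """
    last = n_states - 1
    v = [0.0] * n_states
    for _ in range(max_iter):
        new_v = [
            max(
                reward(s, a)
                + beta * sum(p * v[min(t, last)] for t, p in transition(s, a))
                for a in actions
            )
            for s in range(n_states)
        ]
        diff = max(abs(x - y) for x, y in zip(new_v, v))
        v = new_v
        if diff < tol:
            break
    return v


# Hypothetical toy model with an unknown parameter theta.
def make_model(theta):
    def transition(s, a):
        if a == 1:                       # reset to state 0
            return [(0, 1.0)]
        return [(s + 1, theta), (s, 1.0 - theta)]   # continue

    def reward(s, a):
        return -3.0 if a == 1 else -float(s)        # unbounded below in s

    return transition, reward


transition, reward = make_model(theta=0.5)
v_small = truncated_value_iteration(20, (0, 1), transition, reward, beta=0.9)
v_large = truncated_value_iteration(40, (0, 1), transition, reward, beta=0.9)
print(v_small[0], v_large[0])
```

In this toy model the two truncation levels give essentially the same value at state 0, because resetting is preferable long before the truncation boundary is reached. An adaptive scheme would replace `theta` by a running estimate of the true parameter and recompute the values; how that is done in the paper is not specified in this review.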
Reviewer: V.Kankova
MSC:
93E03 | Stochastic systems in control theory (general) |
93E10 | Estimation and detection in stochastic control theory |
93C40 | Adaptive control/observation systems |
93E20 | Optimal stochastic control |
90C40 | Markov and semi-Markov decision processes |