Addendum to “On an index policy for restless bandits”. (English) Zbl 0727.90090
Summary: We show that the fluid approximation to P. Whittle’s index policy [in: A celebration of applied probability, J. Appl. Probab., Spec. Vol. 25A, 287-298 (1988; Zbl 0664.90043)] for restless bandits has a globally asymptotically stable equilibrium point when the bandits move on just three states. It follows that in this case the index policy is asymptotically optimal.
MSC:
90C40 Markov and semi-Markov decision processes