
Hoeffding’s inequality for uniformly ergodic Markov chains. (English) Zbl 0999.60019

The aim of this mathematical note is to generalize Hoeffding's inequality, which, in its classical form, gives an exponential bound on the deviation of partial sums of independent, bounded random variables from their expectation. The authors propose and prove an extension of Hoeffding's inequality to the setting of uniformly ergodic Markov chains. Such a bound on the deviation of partial sums derived from a Markov chain from their expectation is particularly useful when uniform control of the constants appearing in the exponential bound is required, notably in the analysis of reinforcement learning algorithms.
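For orientation, the classical independent-case inequality that the note generalizes can be stated as follows (a standard textbook formulation, recalled here for context rather than quoted from the paper; the Markov-chain version proved in the note has the same exponential form, with constants the authors control uniformly through the chain's ergodicity properties):

Let $X_1,\dots,X_n$ be independent random variables with $a_i \le X_i \le b_i$ almost surely, and set $S_n = \sum_{i=1}^n X_i$. Then, for every $t > 0$,
\[
\mathbb{P}\bigl(S_n - \mathbb{E}[S_n] \ge t\bigr) \;\le\; \exp\!\left(-\frac{2t^2}{\sum_{i=1}^n (b_i - a_i)^2}\right),
\]
and the same bound holds for the lower deviation $\mathbb{P}(\mathbb{E}[S_n] - S_n \ge t)$.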

MSC:

60E15 Inequalities; stochastic orderings

References:

[1] Asmussen, S.; Glynn, P. W.; Thorisson, H., Stationarity detection in the initial transient problem, ACM Trans. Modeling and Comput. Simulation, 2, 130-157 (1992) · Zbl 0842.68106
[2] Azuma, K., Weighted sums of certain dependent random variables, Tôhoku Math. J., 19, 3, 357-367 (1967) · Zbl 0178.21103
[3] Balaji, S.; Meyn, S. P., Multiplicative ergodicity for irreducible Markov chains, Stochastic Process. Appl., 90, 123-144 (2000) · Zbl 1046.60065
[4] Devroye, L.; Györfi, L.; Lugosi, G., A Probabilistic Theory of Pattern Recognition (1996), Springer: Springer New York · Zbl 0853.68150
[5] Doob, J. L., Stochastic Processes (1953), Wiley: Wiley New York · Zbl 0053.26802
[6] Hoeffding, W., Probability inequalities for sums of bounded random variables, J. Amer. Statist. Assoc., 58, 13-30 (1963) · Zbl 0127.10602
[7] Meyn, S. P.; Tweedie, R. L., Markov Chains and Stochastic Stability (1993), Springer: Springer New York · Zbl 0925.60001
[8] Ormoneit, D.; Glynn, P. W., Kernel-based reinforcement learning in average-cost problems, IEEE Trans. Automat. Control (2001), to appear · Zbl 1364.90349
[9] Puterman, M. L., Markov Decision Processes: Discrete Stochastic Dynamic Programming (1994), Wiley: Wiley New York · Zbl 0829.90134
[10] Rosenthal, J. S., Rates of Convergence for Gibbs Sampler and Other Markov Chains, Doctoral Dissertation, Harvard University (1992)
[11] Serfling, R. J., Approximation Theorems of Mathematical Statistics (1980), Wiley: Wiley New York · Zbl 0423.60030