×

Uniform Chernoff and Dvoretzky-Kiefer-Wolfowitz-type inequalities for Markov chains and related processes. (English) Zbl 1320.60060

Summary: We observe that the technique of Markov contraction can be used to establish measure concentration for a broad class of noncontracting chains. In particular, geometric ergodicity provides a simple and versatile framework. This leads to a short, elementary proof of a general concentration inequality for Markov and hidden Markov chains, which supersedes some of the known results and easily extends to other processes such as Markov trees. As applications, we provide a Dvoretzky-Kiefer-Wolfowitz-type inequality (cf.[A. Dvoretzky et al., Ann. Math. Stat. 27, 642–669 (1956; Zbl 0073.14603)]) and a uniform Chernoff bound. All of our bounds are dimension-free and hold for countably infinite state spaces.

MSC:

60E15 Inequalities; stochastic orderings
60J10 Markov chains (discrete-time Markov processes on discrete state spaces)

Citations:

Zbl 0073.14603

References:

[1] Adamczak, R. (2008). A tail inequality for suprema of unbounded empirical processes with applications to Markov chains. Electron. J. Prob. 13, 1000-1034. · Zbl 1190.60010 · doi:10.1214/EJP.v13-521
[2] Adamczak, R. and Bednorz, W. (2012). Exponential concentration inequalities for additive functionals of Markov chains. Preprint. Available at http://arxiv.org/abs/1201.3569v1. · Zbl 1364.60028
[3] Anandkumar, A., Hsu, D. and Kakade, S. M. (2012). A method of moments for mixture models and hidden Markov models. In Proc. 25th Annual Conf. Learning Theory (Edinburgh, June 2012), 34 pp.
[4] Berend, D. and Kontorovich, A. (2013). A sharp estimate of the binomial mean absolute deviation with applications. Statist. Prob. Lett. 83, 1254-1259. · Zbl 1268.60021 · doi:10.1016/j.spl.2013.01.023
[5] Bobkov, S. G. and Götze, F. (2010). Concentration of empirical distribution functions with applications to non-i.i.d. models. Bernoulli 16, 1385-1414. · Zbl 1207.62106 · doi:10.3150/10-BEJ254
[6] Brémaud, P. (1999). Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues . Springer, New York. · Zbl 0949.60009
[7] Chazottes, J.-R. and Redig, F. (2009). Concentration inequalities for Markov processes via coupling. Electron. J. Prob. 14, 1162-1180. · Zbl 1191.60023
[8] Chazottes, J.-R., Collet, P., Külske, C. and Redig, F. (2007). Concentration inequalities for random fields via coupling. Prob. Theory Relat. Fields 137, 201-225. · Zbl 1111.60070 · doi:10.1007/s00440-006-0026-1
[9] Chung, K.-M., Lam, H., Liu, Z. and Mitzenmacher, M. (2012). Chernoff-Hoeffding bounds for Markov chains: generalized and simplified. In 29th Internat. Symp. Theoret. Aspects Comput. Sci. , Schloss Dagstuhl, Wadern, pp. 124-135. · Zbl 1245.68143 · doi:10.4230/LIPIcs.STACS.2012.124
[10] Diaconis, P. and Saloff-Coste, L. (1996). Logarithmic Sobolev inequalities for finite Markov chains. Ann. Appl. Prob. 6, 695-750. · Zbl 0867.60043 · doi:10.1214/aoap/1034968224
[11] Diaconis, P. and Saloff-Coste, L. (1996). Nash inequalities for finite Markov chains. J. Theoret. Prob. 9, 459-510. · Zbl 0870.60064 · doi:10.1007/BF02214660
[12] Dinwoodie, I. H. (1995). A probability inequality for the occupation measure of a reversible Markov chain. Ann. Appl. Prob. 5, 37-43. · Zbl 0829.60022 · doi:10.1214/aoap/1177004826
[13] Dinwoodie, I. H. (1998). Expectations for nonreversible Markov chains. J. Math. Anal. Appl. 220, 585-596. · Zbl 0946.60067 · doi:10.1006/jmaa.1997.5850
[14] Dvoretzky, A., Kiefer, J. and Wolfowitz, J. (1956). Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator. Ann. Math. Statist. 27, 642-669. · Zbl 0073.14603 · doi:10.1214/aoms/1177728174
[15] Fill, J. A. (1991). Eigenvalue bounds on convergence to stationarity for nonreversible Markov chains, with an application to the exclusion process. Ann. Appl. Prob. 1, 62-87. · Zbl 0726.60069 · doi:10.1214/aoap/1177005981
[16] Gillman, D. (1998). A Chernoff bound for random walks on expander graphs. SIAM J. Comput. 27, 1203-1220. · Zbl 0911.60016 · doi:10.1137/S0097539794268765
[17] Hsu, D., Kakade, S. M. and Zhang, T. (2009). A spectral algorithm for learning hidden Markov models. In Proc. 22nd Annual Conf. Learning Theory (Montreal, June 2009), 10 pp. · Zbl 1244.68065 · doi:10.1016/j.jcss.2011.12.025
[18] Kahale, N. (1997). Large deviation bounds for Markov chains. Combin. Prob. Comput. 6, 465-474. · Zbl 0892.60080 · doi:10.1017/S0963548397003209
[19] Kontorovich, A. (2012). Obtaining measure concentration from Markov contraction. Markov Process. Relat. Fields 18, 613-638. · Zbl 1286.60073
[20] Kontorovich, A., Nadler, B. and Weiss, R. (2013). On learning parametric-output HMMS. In Proc. 30th Internat. Conf. Machine Learning (June 2013, Atlanta), pp. 702-710.
[21] Kontorovich, L. and Ramanan, K. (2008). Concentration inequalities for dependent random variables via the martingale method. Ann. Prob. 36, 2126-2158. · Zbl 1154.60310 · doi:10.1214/07-AOP384
[22] Kontoyiannis, I. and Meyn, S. P. (2012). Geometric ergodicity and the spectral gap of non-reversible Markov chains. Prob. Theory Relat. Fields 154, 327-339. · Zbl 1263.60064 · doi:10.1007/s00440-011-0373-4
[23] León, C. A. and Perron, F. (2004). Optimal Hoeffding bounds for discrete reversible Markov chains. Ann. Appl. Prob. 14, 958-970. · Zbl 1056.60070 · doi:10.1214/105051604000000170
[24] Lezaud, P. (1998). Chernoff-type bound for finite Markov chains. Ann. Appl. Prob. 8, 849-867. · Zbl 0938.60027 · doi:10.1214/aoap/1028903453
[25] Markov, A. A. (1906). Extension of the law of large numbers to dependent quantities. Izvestiia Fiz.-Matem. Obsch. Kazan Univ. 15, 135-156.
[26] Marton, K. (1996). Bounding \(\bar{d}\)-distance by informational divergence: a method to prove measure concentration. Ann. Prob. 24, 857-866. · Zbl 0865.60017 · doi:10.1214/aop/1039639365
[27] Marton, K. (1998). Measure concentration for a class of random processes. Prob. Theory Relat. Fields 110, 427-439. · Zbl 0927.60050 · doi:10.1007/s004400050154
[28] Marton, K. (2003). Measure concentration and strong mixing. Studia Sci. Math. Hungarica 40, 95-113. · Zbl 1027.60011 · doi:10.1556/SScMath.40.2003.1-2.8
[29] Marton, K. (2004). Measure concentration for Euclidean distance in the case of dependent random variables. Ann. Prob. 32, 2526-2544. · Zbl 1071.60012 · doi:10.1214/009117904000000702
[30] Massart, P. (1990). The tight constant in the Dvoretzky-Kiefer-Wolfowitz inequality. Ann. Prob. 18, 1269-1283. · Zbl 0713.62021 · doi:10.1214/aop/1176990746
[31] Mossel, E. and Roch, S. (2006). Learning nonsingular phylogenies and hidden Markov models. Ann. Appl. Prob. 16, 583-614. · Zbl 1137.60034 · doi:10.1214/105051606000000024
[32] Rio, E. (2000). Inégalités de Hoeffding pour les fonctions lipschitziennes de suites dépendantes. C. R. Acad. Sci. Paris Sér. I Math. 330, 905-908. · Zbl 0961.60032 · doi:10.1016/S0764-4442(00)00290-1
[33] Samson, P.-M. (2000). Concentration of measure inequalities for Markov chains and \(\Phi\)-mixing processes. Ann. Prob. 28, 416-461. · Zbl 1044.60061 · doi:10.1214/aop/1019160125
[34] Siddiqi, S., Boots, B. and Gordon, G. (2010). Reduced-rank hidden Markov models. In Proc. 13th Internat Conf. Artificial Intelligence Statist. (Sardinia, Italy, May 2010), pp. 741-748.
[35] Wagner, R. (2008). Tail estimates for sums of variables sampled by a random walk. Combin. Prob. Comput. 17, 307-316. · Zbl 1144.60308 · doi:10.1017/S0963548307008772
[36] Zou, J. Y., Hsu, D., Parkes, D. and Adams, R. P. (2013). Contrastive learning using spectral methods. In Advances in Neural Information Processing Systems 26 , eds C. J. C. Burges et al. , pp. 2238-2246.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.