×

On the geometric ergodicity of Hamiltonian Monte Carlo. (English) Zbl 1428.62375

Summary: We establish general conditions under which Markov chains produced by the Hamiltonian Monte Carlo method will and will not be geometrically ergodic. We consider implementations with both position-independent and position-dependent integration times. In the former case, we find that the conditions for geometric ergodicity are essentially a gradient of the log-density which asymptotically points towards the centre of the space and grows no faster than linearly. In an idealised scenario in which the integration time is allowed to change in different regions of the space, we show that geometric ergodicity can be recovered for a much broader class of tail behaviours, leading to some guidelines for the choice of this free parameter in practice.

MSC:

60J22 Computational methods in Markov chains
65C05 Monte Carlo methods

Software:

NUTS; Stan

References:

[1] Alder, B.J. and Wainwright, T.E. (1959). Studies in molecular dynamics. I. General method. J. Chem. Phys.31 459-466.
[2] Andrieu, C., De Freitas, N., Doucet, A. and Jordan, M.I. (2003). An introduction to MCMC for machine learning. Mach. Learn.50 5-43. · Zbl 1033.68081 · doi:10.1023/A:1020281327116
[3] Beskos, A., Pillai, N., Roberts, G., Sanz-Serna, J.-M. and Stuart, A. (2013). Optimal tuning of the hybrid Monte Carlo algorithm. Bernoulli19 1501-1534. · Zbl 1287.60090 · doi:10.3150/12-BEJ414
[4] Beskos, A., Pinski, F.J., Sanz-Serna, J.M. and Stuart, A.M. (2011). Hybrid Monte Carlo on Hilbert spaces. Stochastic Process. Appl.121 2201-2230. · Zbl 1235.60091 · doi:10.1016/j.spa.2011.06.003
[5] Betancourt, M. (2013). A general metric for Riemannian manifold Hamiltonian Monte Carlo. In Geometric Science of Information. Lecture Notes in Computer Science8085 327-334. Heidelberg: Springer. · Zbl 1406.65004
[6] Betancourt, M. (2016). Identifying the Optimal Integration Time in Hamiltonian Monte Carlo. Preprint. Available at arXiv:1601.00225.
[7] Betancourt, M., Byrne, S., Livingstone, S. and Girolami, M. (2017). The geometric foundations of Hamiltonian Monte Carlo. Bernoulli23 2257-2298. · Zbl 1380.60070 · doi:10.3150/16-BEJ810
[8] Bou-Rabee, N. and Hairer, M. (2013). Nonasymptotic mixing of the MALA algorithm. IMA J. Numer. Anal.33 80-110. · Zbl 1305.65012 · doi:10.1093/imanum/drs003
[9] Bou-Rabee, N. and Sanz-Serna, J.M. (2017). Randomized Hamiltonian Monte Carlo. Ann. Appl. Probab.27 2159-2194. Available at arXiv:1511.09382. · Zbl 1373.60129 · doi:10.1214/16-AAP1255
[10] Bou-Rabee, N. and Sanz-Serna, J.M. (2018). Geometric integrators and the Hamiltonian Monte Carlo method. Acta Numer.27 113-206. · Zbl 1431.65004 · doi:10.1017/S0962492917000101
[11] Byrne, S. and Girolami, M. (2013). Geodesic Monte Carlo on embedded manifolds. Scand. J. Stat.40 825-845. · Zbl 1349.62186 · doi:10.1111/sjos.12036
[12] Campos, C.M. and Sanz-Serna, J.M. (2015). Extra chance generalized hybrid Monte Carlo. J. Comput. Phys.281 365-374. · Zbl 1352.65007 · doi:10.1016/j.jcp.2014.09.037
[13] Cancès, E., Legoll, F. and Stoltz, G. (2007). Theoretical and numerical comparison of some sampling methods for molecular dynamics. ESAIM Math. Model. Numer. Anal.41 351-389. · Zbl 1138.82341 · doi:10.1051/m2an:2007014
[14] Carpenter, B., Gelman, A., Hoffman, M., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M.A., Guo, J., Li, P. and Riddell, A. (2016). Stan: A probabilistic programming language. J. Stat. Softw..
[15] Diaconis, P. (2013). Some things we’ve learned (about Markov chain Monte Carlo). Bernoulli19 1294-1305. · Zbl 1412.60109 · doi:10.3150/12-BEJSP09
[16] Diaconis, P. and Freedman, D. (1999). Iterated random functions. SIAM Rev.41 45-76. · Zbl 0926.60056 · doi:10.1137/S0036144598338446
[17] Diaconis, P., Seiler, C. and Holmes, S. (2014). Connections and extensions: A discussion of the paper by Girolami and Byrne [MR3145120]. Scand. J. Stat.41 3-7. · Zbl 1349.62192 · doi:10.1111/sjos.12070
[18] Duane, S., Kennedy, A.D., Pendleton, B.J. and Roweth, D. (1987). Hybrid Monte Carlo. Phys. Lett. B195 216-222.
[19] Durmus, A. and Moulines, É. (2015). Quantitative bounds of convergence for geometrically ergodic Markov chain in the Wasserstein distance with application to the Metropolis adjusted Langevin algorithm. Stat. Comput.25 5-19. · Zbl 1332.62282
[20] Durmus, A., Moulines, E. and Saksman, E. (2017). On the convergence of Hamiltonian Monte Carlo. Preprint. Available at arXiv:1705.00166. · Zbl 1460.65005
[21] Eberle, A. (2014). Error bounds for Metropolis-Hastings algorithms applied to perturbations of Gaussian measures in high dimensions. Ann. Appl. Probab.24 337-377. · Zbl 1296.60195 · doi:10.1214/13-AAP926
[22] Gelman, A., Carlin, J.B., Stern, H.S. and Rubin, D.B. (2014). Bayesian Data Analysis2. Taylor & Francis.
[23] Girolami, M. and Calderhead, B. (2011). Riemann manifold Langevin and Hamiltonian Monte Carlo methods. J. R. Stat. Soc. Ser. B. Stat. Methodol.73 123-214. With discussion and a reply by the authors. · Zbl 1411.62071 · doi:10.1111/j.1467-9868.2010.00765.x
[24] Goldstein, H. (1965). Classical Mechanics. Pearson Education India. · Zbl 0043.18001
[25] Hastings, W.K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika57 97-109. · Zbl 0219.65008 · doi:10.1093/biomet/57.1.97
[26] Hoffman, M.D. and Gelman, A. (2014). The no-U-turn sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo. J. Mach. Learn. Res.15 1593-1623. · Zbl 1319.60150
[27] Horowitz, A.M. (1991). A generalized guided Monte Carlo algorithm. Phys. Lett. B268 247-252.
[28] Jarner, S.F. and Hansen, E. (2000). Geometric ergodicity of Metropolis algorithms. Stochastic Process. Appl.85 341-361. · Zbl 0997.60070 · doi:10.1016/S0304-4149(99)00082-4
[29] Jarner, S.F. and Tweedie, R.L. (2003). Necessary conditions for geometric and polynomial ergodicity of random-walk-type Markov chains. Bernoulli9 559-578. · Zbl 1043.60054 · doi:10.3150/bj/1066223269
[30] Jones, G.L. and Hobert, J.P. (2001). Honest exploration of intractable probability distributions via Markov chain Monte Carlo. Statist. Sci.16 312-334. · Zbl 1127.60309 · doi:10.1214/ss/1015346317
[31] Lee, J.M. (2012). Introduction to Smooth Manifolds. Springer.
[32] Leimkuhler, B. and Reich, S. (2004). Simulating Hamiltonian Dynamics. Cambridge Monographs on Applied and Computational Mathematics14. Cambridge: Cambridge Univ. Press. · Zbl 1069.65139
[33] Livingstone, S., Betancourt, M., Byrne, S. and Girolami, M. (2018). Supplement to “On the geometric ergodicity of Hamiltonian Monte Carlo.” DOI:10.3150/18-BEJ1083SUPP. · Zbl 1428.62375
[34] Livingstone, S. and Girolami, M. (2014). Information-geometric Markov chain Monte Carlo methods using diffusions. Entropy16 3074-3102. · Zbl 1338.65018 · doi:10.3390/e16063074
[35] Mattingly, J.C., Stuart, A.M. and Higham, D.J. (2002). Ergodicity for SDEs and approximations: Locally Lipschitz vector fields and degenerate noise. Stochastic Process. Appl.101 185-232. · Zbl 1075.60072 · doi:10.1016/S0304-4149(02)00150-3
[36] Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H. and Teller, E. (1953). Equation of state calculations by fast computing machines. J. Chem. Phys.21 1087-1092. · Zbl 1431.65006
[37] Meyn, S.P. and Tweedie, R.L. (2012). Markov Chains and Stochastic Stability. Springer Science & Business Media. · Zbl 0925.60001
[38] Neal, R.M. (2011). MCMC using Hamiltonian dynamics. In Handbook of Markov Chain Monte Carlo. Chapman & Hall/CRC Handb. Mod. Stat. Methods 113-162. Boca Raton, FL: CRC Press. · Zbl 1229.65018
[39] Ollivier, Y. (2009). Ricci curvature of Markov chains on metric spaces. J. Funct. Anal.256 810-864. · Zbl 1181.53015 · doi:10.1016/j.jfa.2008.11.001
[40] Ottobre, M., Pillai, N.S., Pinski, F.J. and Stuart, A.M. (2016). A function space HMC algorithm with second order Langevin diffusion limit. Bernoulli22 60-106. · Zbl 1346.60119 · doi:10.3150/14-BEJ621
[41] Roberts, G.O. and Rosenthal, J.S. (1997). Geometric ergodicity and hybrid Markov chains. Electron. Commun. Probab.2 13-25. · Zbl 0890.60061 · doi:10.1214/ECP.v2-981
[42] Roberts, G.O. and Rosenthal, J.S. (2004). General state space Markov chains and MCMC algorithms. Probab. Surv.1 20-71. · Zbl 1189.60131 · doi:10.1214/154957804100000024
[43] Roberts, G.O. and Tweedie, R.L. (1996). Geometric convergence and central limit theorems for multidimensional Hastings and Metropolis algorithms. Biometrika83 95-110. · Zbl 0888.60064 · doi:10.1093/biomet/83.1.95
[44] Roberts, G.O. and Tweedie, R.L. (1996). Exponential convergence of Langevin distributions and their discrete approximations. Bernoulli2 341-363. · Zbl 0870.60027 · doi:10.2307/3318418
[45] Seiler, C., Rubinstein-Salzedo, S. and Holmes, S. (2014). Positive curvature and Hamiltonian Monte Carlo. In Advances in Neural Information Processing Systems 586-594.
[46] Stuart, A.M. (2010). Inverse problems: A Bayesian perspective. Acta Numer.19 451-559. · Zbl 1242.65142 · doi:10.1017/S0962492910000061
[47] Tierney, L. (1994). Markov chains for exploring posterior distributions. Ann. Statist.22 1701-1762. With discussion and a rejoinder by the author. · Zbl 0829.62080 · doi:10.1214/aos/1176325750
[48] Tierney, L. (1998). A note on Metropolis-Hastings kernels for general state spaces. Ann. Appl. Probab.8 1-9. · Zbl 0935.60053 · doi:10.1214/aoap/1027961031
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.