×

Spectral gap of replica exchange Langevin diffusion on mixture distributions. (English) Zbl 1492.60218

Summary: Langevin diffusion (LD) is one of the main workhorses for sampling problems. However, its convergence rate can be significantly reduced if the target distribution is a mixture of multiple densities, especially when each component density concentrates around a different mode. Replica exchange Langevin diffusion (ReLD) is a sampling method that can circumvent this issue. This approach can be further extended to multiple replica exchange Langevin diffusion (mReLD). While ReLD and mReLD have been used extensively in statistics, molecular dynamics, and other applications, there is limited existing analysis on its convergence rate and choices of the temperatures. This paper addresses these problems assuming the target distribution is a mixture of log-concave densities. We show ReLD can obtain constant or better convergence rates. We also show mReLD with \(K\) additional LDs can achieve the same results while the exchange frequency only needs to be \((1/K)\)-th power of the one in ReLD.

MSC:

60J22 Computational methods in Markov chains
65C05 Monte Carlo methods
65C40 Numerical analysis or methods applied to Markov chains

Software:

Gromacs

References:

[1] Abraham, M. J.; Gready, J. E., Ensuring mixing efficiency of replica-exchange molecular dynamics simulations, J. Chem. Theory Comput., 4, 7, 1119-1128 (2008)
[2] Abraham, M. J.; Murtola, T.; Schulz, R.; Páll, S.; Smith, J. C.; Hess, B.; Lindahl, E., GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers, SoftwareX, 1, 19-25 (2015)
[3] Aristoff, D.; Lelievre, T., Mathematical analysis of temperature accelerated dynamics, Multiscale Model. Simul., 12, 1, 290-317 (2014) · Zbl 1326.82018
[4] Bakry, D.; Barthe, F.; Cattiaux, P.; Guillin, A., A simple proof of the Poincaré inequality for a large class of probability measures, Electron. Commun. Probab., 13, 60-66 (2008) · Zbl 1186.26011
[5] Bakry, D.; Gentil, I.; Ledoux, M., Analysis and Geometry of Markov Diffusion Operators, Vol. 348 (2013), Springer Science & Business Media
[6] Bebendorf, M., A note on the poincaré inequality for convex domains, Zeitschrift Für Analysis Und Ihre Anwendungen, 22, 4, 751-756 (2003) · Zbl 1057.26011
[7] Y. Chen, J. Chen, J. Dong, J. Peng, Z. Wang, Accelerating nonconvex learning via replica exchange Langevin diffusion, in: International Conference on Learning Representations, 2019.
[8] Cho, K.; Raiko, T.; Ilin, A., Parallel tempering is efficient for learning restricted Boltzmann machines, (The 2010 International Joint Conference on Neural Networks. The 2010 International Joint Conference on Neural Networks, Ijcnn (2010), IEEE), 1-8
[9] Chodera, J. D.; Shirts, M. R., Replica exchange and expanded ensemble simulations as Gibbs sampling: Simple improvements for enhanced mixing, J. Chem. Phys., 135, 19, Article 194110 pp. (2011)
[10] Cui, T.; Martin, J.; Marzouk, Y. M.; Solonen, A.; Spantini, A., Likelihood-informed dimension reduction for nonlinear inverse problems, Inverse Problems, 30, 11, Article 114015 pp. (2014) · Zbl 1310.62030
[11] Desjardins, G.; Courville, A.; Bengio, Y.; Vincent, P.; Delalleau, O., Parallel tempering for training of restricted Boltzmann machines, (Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (2010), MIT Press: MIT Press Cambridge, MA), 145-152
[12] Doll, J.; Dupuis, P.; Nyquist, P., A large deviations analysis of certain qualitative properties of parallel tempering and infinite swapping algorithms, Appl. Math. Optim., 78, 1, 103-144 (2018) · Zbl 1397.65015
[13] Dong, J.; Tong, X. T., Replica exchange for non-convex optimization, J. Mach. Learn. Res., 22, 173, 1-59 (2021) · Zbl 07415116
[14] Dupuis, P.; Liu, Y.; Plattner, N.; Doll, J., On the infinite swapping limit for parallel tempering, Multiscale Model. Simul., 10, 3, 986-1022 (2012) · Zbl 1261.65009
[15] Earl, D. J.; Deem, M. W., Parallel tempering: Theory, applications, and new perspectives, Phys. Chem. Chem. Phys., 7, 23, 3910-3916 (2005)
[16] Ebbers, M.; Knöpfel, H.; Löwe, M.; Vermet, F., Mixing times for the swapping algorithm on the Blume-Emery-Griffiths model, Random Struct. Algorithms, 45, 1, 38-77 (2014) · Zbl 1305.60064
[17] Ge, R.; Lee, H.; Risteski, A., Simulated tempering langevin Monte Carlo II: An improved proof using soft Markov chain decomposition (2018), arXiv preprint arXiv:1812.00793
[18] Geyer, C. J.; Thompson, E. A., Annealing Markov chain Monte Carlo with applications to ancestral inference, J. Amer. Statist. Assoc., 90, 431, 909-920 (1995) · Zbl 0850.62834
[19] Hairer, M.; Stuart, A.; Vollmer, S., Spectral gaps for a Metropolis-Hastings algorithm in infinite dimensions, Ann. Appl. Probab., 24, 6, 2455-2490 (2014) · Zbl 1307.65002
[20] Hult, H.; Nyquist, P.; Ringqvist, C., Infinite swapping algorithm for training restricted Boltzmann machines, (International Conference on Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing (2018), Springer), 285-307 · Zbl 07240100
[21] Kofke, D. A., On the acceptance probability of replica-exchange Monte Carlo trials, J. Chem. Phys., 117, 15, 6911-6914 (2002)
[22] Lelievre, T.; Stoltz, G., Partial differential equations and stochastic methods in molecular dynamics, Acta Numer., 25, 681-880 (2016) · Zbl 1348.82065
[23] Lindvall, T., Lectures on the Coupling Method (2002), Courier Corporation · Zbl 1013.60001
[24] Madras, N.; Randall, D., Markov chain decomposition for convergence rate analysis, Ann. Appl. Probab., 581-606 (2002) · Zbl 1017.60080
[25] Marinari, E.; Parisi, G., Simulated tempering: A new Monte Carloscheme, Europhys. Lett., 19, 6, 451 (1992)
[26] Menz, G.; Schlichting, A., Poincaré and logarithmic Sobolev inequalities by decomposition of the energy landscape, Ann. Probab., 42, 5, 1809-1884 (2014) · Zbl 1327.60156
[27] Menz, G.; Schlichting, A.; Tang, W.; Wu, T., Ergodicity of the infinite swapping algorithm at low temperature (2021), arXiv preprint arXiv:1811.10174
[28] Morzfeld, M.; Tong, X. T.; Marzouk, Y. M., Localization for MCMC: sampling high-dimensional posterior distributions with local structure, J. Comput. Phys., 380, 1-28 (2019) · Zbl 1451.65011
[29] Neal, R. M., Sampling from multimodal distributions using tempered transitions, Stat. Comput., 6, 4, 353-366 (1996)
[30] Roberts, G. O.; Rosenthal, J. S., Minimising MCMC variance via diffusion limits, with an application to simulated temperingmcmc variance via diffusion limits, with an application to simulated tempering, Ann. Appl. Probab., 24, 1, 131-149 (2014) · Zbl 1298.60078
[31] Sindhikara, D. J.; Emerson, D. J.; Roitberg, A. E., Exchange often and properly in replica exchange molecular dynamics, J. Chem. Theory Comput., 6, 9, 2804-2808 (2010)
[32] Sugita, Y.; Okamoto, Y., Replica-exchange molecular dynamics method for protein folding, Chem. Phys. Lett., 314, 1-2, 141-151 (1999)
[33] Swendsen, R. H.; Wang, J.-S., Replica Monte Carlo simulation of spin-glasses, Phys. Rev. Lett., 57, 21, 2607 (1986)
[34] Syed, S.; Bouchard-Côté, A.; Deligiannidis, G.; Doucet, A., Non-reversible parallel tempering: A scalable highly parallel mcmc scheme (2019), arXiv preprint arXiv:1905.02939
[35] Tawn, N. G.; Roberts, G. O., Accelerating parallel tempering: Quantile tempering algorithm (QuanTA), Adv. Appl. Probab., 51, 3, 802-834 (2019) · Zbl 1429.65012
[36] Tawn, N. G.; Roberts, G. O.; Rosenthal, J. S., Weight-preserving simulated tempering, Stat. Comput., 30, 1, 27-41 (2020) · Zbl 1431.60082
[37] Tong, X. T.; Morzfeld, M.; Marzouk, Y. M., MALA-within-gibbs samplers for high-dimensional distributions with sparse conditional structure, SIAM J. Sci. Comput., 42, 3, A1765-A1788 (2020) · Zbl 1444.62073
[38] Woodard, D. B.; Schmidler, S. C.; Huber, M., Conditions for rapid mixing of parallel and simulated tempering on multimodal distributions, Ann. Appl. Probab., 19, 2, 617-640 (2009) · Zbl 1171.65008
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.