Abstract
Sampling from various kinds of distributions is an issue of paramount importance in statistics, since it is often the key ingredient for constructing estimators, test procedures, or confidence intervals. In many situations, exact sampling from a given distribution is impossible or computationally expensive, and one therefore needs to resort to approximate sampling strategies. However, only very recently has a mathematical theory begun to be developed that provides non-asymptotic guarantees for the approximate sampling problem in high-dimensional settings. In this paper we introduce a new mathematical framework that helps to analyze Stochastic Gradient Descent as a sampling method, closely related to Langevin Monte Carlo.
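To make the connection concrete, the Langevin Monte Carlo scheme mentioned above can be sketched as the iteration θ_{k+1} = θ_k − γ∇f(θ_k) + √(2γ)·ξ_k with ξ_k standard Gaussian, which approximately samples from a density proportional to exp(−f). The snippet below is a minimal illustrative sketch, not the paper's algorithm or analysis: it uses a toy potential f(x) = ‖x‖²/2 (so the target is the standard Gaussian), and the step size, iteration count, and function names are assumptions made for the example.

```python
import numpy as np

def grad_potential(theta):
    # Toy potential f(x) = ||x||^2 / 2, so the target density is the
    # standard Gaussian and grad f(x) = x.
    return theta

def langevin_mc(grad_f, dim=2, n_iters=5000, gamma=0.05, seed=0):
    """One run of the (unadjusted) Langevin Monte Carlo iteration:
    theta_{k+1} = theta_k - gamma * grad_f(theta_k) + sqrt(2*gamma) * xi_k,
    with xi_k ~ N(0, I). Replacing grad_f by an unbiased stochastic
    estimate yields a stochastic-gradient variant of the same scheme."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)
    samples = np.empty((n_iters, dim))
    for k in range(n_iters):
        noise = rng.standard_normal(dim)
        theta = theta - gamma * grad_f(theta) + np.sqrt(2.0 * gamma) * noise
        samples[k] = theta
    return samples

samples = langevin_mc(grad_potential)
# After a burn-in, the empirical mean should be near 0 and the empirical
# variance near 1 (up to a discretization bias of order gamma).
print(samples[1000:].mean(axis=0))
print(samples[1000:].var(axis=0))
```

Note the discretization bias: even at stationarity, the chain's variance is slightly above 1 for γ > 0, which is exactly the kind of non-asymptotic error the guarantees discussed in the abstract quantify.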
Russian Text © A. G. Karagulyan, 2019, published in Izvestiya Natsional’noi Akademii Nauk Armenii, Matematika, 2019, No. 2, pp. 43–53.
Karagulyan, A.G. Non-Asymptotic Guarantees for Sampling by Stochastic Gradient Descent. J. Contemp. Mathemat. Anal. 54, 71–78 (2019). https://doi.org/10.3103/S1068362319020031