×

Safe adaptive importance sampling: a mixture approach. (English) Zbl 1472.62016

Adaptive importance sampling (AIS) constitutes new samples, such as particles in statistical physics, generated under certain probability distribution called policy \(q_k\) and the next policy \(q_{k+1}\) uses the new particles adaptively. In the earlier works, the policy is chosen as the kernel density estimate based on the previous particles reweighted by their importance weights. The authors propose a new approach called ‘safe adaptive importance sampling’ (SAIS) which estimates the policy as a mixture of kernel density estimate and certain ‘safe’ density with heavier tails. They also consider the functional approximation and derive convergence rates, leading to a central limit theorem for the estimates. It is observed that the asymptotic variance with this procedure is the same as that of an ‘oracle’ procedure. Further, a subsampling approach can be adopted to reduce the computational time involved without loosing the original efficiency. A simulation study at the end illustrates the practical nature of the algorithms developed. A section at the end gives detailed mathematical proofs including two appendices. There is a rich list of useful references.

MSC:

62D05 Sampling theory, sample surveys
62G07 Density estimation
65C05 Monte Carlo methods
60F05 Central limit and other weak theorems
60G44 Martingales with continuous parameter

References:

[1] Azaïs, R., Delyon, B. and Portier, F. (2018). Integral estimation based on Markovian design. Adv. in Appl. Probab. 50 833-857. · Zbl 1436.62397 · doi:10.1017/apr.2018.38
[2] Bertail, P. and Portier, F. (2019). Rademacher complexity for Markov chains: Applications to kernel smoothing and Metropolis-Hastings. Bernoulli 25 3912-3938. · Zbl 1431.60080 · doi:10.3150/19-bej1115
[3] Bornn, L., Jacob, P. E., Del Moral, P. and Doucet, A. (2013). An adaptive interacting Wang-Landau algorithm for automatic density exploration. J. Comput. Graph. Statist. 22 749-773. · doi:10.1080/10618600.2012.723569
[4] Boucheron, S., Lugosi, G. and Massart, P. (2013). Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford Univ. Press, Oxford. With a foreword by Michel Ledoux. · Zbl 1337.60003 · doi:10.1093/acprof:oso/9780199535255.001.0001
[5] Cappé, O., Guillin, A., Marin, J. M. and Robert, C. P. (2004). Population Monte Carlo. J. Comput. Graph. Statist. 13 907-929. · doi:10.1198/106186004X12803
[6] Cappé, O., Douc, R., Guillin, A., Marin, J.-M. and Robert, C. P. (2008). Adaptive importance sampling in general mixture classes. Stat. Comput. 18 447-459. · doi:10.1007/s11222-008-9059-x
[7] Chopin, N. (2004). Central limit theorem for sequential Monte Carlo methods and its application to Bayesian inference. Ann. Statist. 32 2385-2411. · Zbl 1079.65006 · doi:10.1214/009053604000000698
[8] Dai, B., He, N., Dai, H. and Song, L. (2016). Provable Bayesian inference via particle mirror descent. In Artificial Intelligence and Statistics 985-994.
[9] Del Moral, P. (2013). Mean Field Simulation for Monte Carlo Integration. Monographs on Statistics and Applied Probability 126. CRC Press, Boca Raton, FL.
[10] Delyon, B. and Portier, F. (2018). Asymptotic optimality of adaptive importance sampling. In Proceedings of the 32nd International Conference on Neural Information Processing Systems 3138-3148.
[11] Douc, R., Guillin, A., Marin, J.-M. and Robert, C. P. (2007a). Convergence of adaptive mixtures of importance sampling schemes. Ann. Statist. 35 420-448. · Zbl 1132.60022 · doi:10.1214/009053606000001154
[12] Douc, R., Guillin, A., Marin, J.-M. and Robert, C. P. (2007b). Minimum variance importance sampling via population Monte Carlo. ESAIM Probab. Stat. 11 427-447. · Zbl 1181.60028 · doi:10.1051/ps:2007028
[13] Elvira, V., Martino, L., Luengo, D. and Bugallo, M. F. (2019). Generalized multiple importance sampling. Statist. Sci. 34 129-155. · Zbl 1420.62038 · doi:10.1214/18-STS668
[14] Evans, M. and Swartz, T. (2000). Approximating Integrals via Monte Carlo and Deterministic Methods. Oxford Statistical Science Series. Oxford Univ. Press, Oxford. · Zbl 0958.65009
[15] Feng, M. B., Maggiar, A., Staum, J. and Wächter, A. (2018). Uniform convergence of sample average approximation with adaptive multiple importance sampling. In 2018 Winter Simulation Conference (WSC) 1646-1657. IEEE.
[16] Folland, G. B. (2013). Real Analysis: Modern Techniques and Their Applications. Wiley, New York.
[17] Fort, G., Jourdain, B., Kuhn, E., Lelièvre, T. and Stoltz, G. (2015). Convergence of the Wang-Landau algorithm. Math. Comp. 84 2297-2327. · Zbl 1317.65011 · doi:10.1090/S0025-5718-2015-02952-4
[18] Freedman, D. A. (1975). On tail probabilities for martingales. Ann. Probab. 3 100-118. · Zbl 0313.60037 · doi:10.1214/aop/1176996452
[19] Geweke, J. (1989). Bayesian inference in econometric models using Monte Carlo integration. Econometrica 57 1317-1339. · Zbl 0683.62068 · doi:10.2307/1913710
[20] Giné, E. and Guillou, A. (2001). On consistency of kernel density estimators for randomly censored data: Rates holding uniformly over adaptive intervals. Ann. Inst. Henri Poincaré Probab. Stat. 37 503-522. · Zbl 0974.62030 · doi:10.1016/S0246-0203(01)01081-0
[21] Giné, E. and Guillou, A. (2002). Rates of strong uniform consistency for multivariate kernel density estimators. Ann. Inst. Henri Poincaré Probab. Stat. 38 907-921. · Zbl 1011.62034 · doi:10.1016/S0246-0203(02)01128-7
[22] Givens, G. H. and Raftery, A. E. (1996). Local adaptive importance sampling for multivariate densities with strong nonlinear relationships. J. Amer. Statist. Assoc. 91 132-141. · Zbl 0869.62025 · doi:10.2307/2291389
[23] Gordon, N. J., Salmond, D. J. and Smith, A. F. (1993). Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proc., F, Radar Signal Process. 140 107-113.
[24] Haario, H., Saksman, E. and Tamminen, J. (2001). An adaptive Metropolis algorithm. Bernoulli 7 223-242. · Zbl 0989.65004
[25] Hall, P. and Heyde, C. C. (1980). Martingale Limit Theory and Its Application. Probability and Mathematical Statistics. Academic Press, New York. · Zbl 0462.60045
[26] Hansen, B. E. (2008). Uniform convergence rates for kernel estimation with dependent data. Econometric Theory 24 726-748. · Zbl 1284.62252 · doi:10.1017/S0266466608080304
[27] Hesterberg, T. (1995). Weighted average importance sampling and defensive mixture distributions. Technometrics 37 185-194. · Zbl 0822.62002
[28] Kloek, T. and Van Dijk, H. K. (1978). Bayesian estimates of equation system parameters: An application of integration by Monte Carlo. Econometrica 1-19. · Zbl 0376.62014
[29] Marin, J.-M., Pudlo, P. and Sedki, M. (2019). Consistency of adaptive importance sampling and recycling schemes. Bernoulli 25 1977-1998. · Zbl 1466.62157 · doi:10.3150/18-BEJ1042
[30] Neddermeyer, J. C. (2009). Computationally efficient nonparametric importance sampling. J. Amer. Statist. Assoc. 104 788-802. · Zbl 1388.62015 · doi:10.1198/jasa.2009.0122
[31] Oh, M.-S. and Berger, J. O. (1992). Adaptive importance sampling in Monte Carlo integration. J. Stat. Comput. Simul. 41 143-168. · Zbl 0781.65016 · doi:10.1080/00949659208810398
[32] Owen, A. B. (2013). Monte Carlo Theory, Methods and Examples.
[33] Owen, A. and Zhou, Y. (2000). Safe and effective importance sampling. J. Amer. Statist. Assoc. 95 135-143. · Zbl 0998.65003 · doi:10.2307/2669533
[34] Pollard, D. (1984). Convergence of Stochastic Processes. Springer Series in Statistics. Springer, New York. · Zbl 0544.60045 · doi:10.1007/978-1-4612-5254-2
[35] Robert, C. P. and Casella, G. (2004). Monte Carlo Statistical Methods, 2nd ed. Springer Texts in Statistics. Springer, New York. · Zbl 1096.62003 · doi:10.1007/978-1-4757-4145-2
[36] Silverman, B. W. (2018). Density Estimation for Statistics and Data Analysis. Routledge, London.
[37] Stone, C. J. (1980). Optimal rates of convergence for nonparametric estimators. Ann. Statist. 8 1348-1360. · Zbl 0451.62033
[38] Temlyakov, V. N. (1998). The best \(m\)-term approximation and greedy algorithms. Adv. Comput. Math. 8 249-265. · Zbl 0905.65063 · doi:10.1023/A:1018900431309
[39] van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics 3. Cambridge Univ. Press, Cambridge. · Zbl 0943.62002 · doi:10.1017/CBO9780511802256
[40] Wang, F. and Landau, D. P. (2001). Determining the density of states for classical statistical models: A random walk algorithm to produce a flat histogram. Phys. Rev. E 64 056101.
[41] West, M. (1993). Approximating posterior distributions by mixtures. J. Roy. Statist. Soc. Ser. B 55 409-422. · Zbl 0800.62221
[42] Zhang, P. (1996). Nonparametric importance sampling. J. Amer. Statist. Assoc. 91 1245-1253 · Zbl 0880.62044 · doi:10.2307/2291743
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.