Adaptive approximate Bayesian computation tolerance selection. (English) Zbl 1480.62165

Summary: Approximate Bayesian Computation (ABC) methods are increasingly used for inference in situations in which the likelihood function is either computationally costly or intractable to evaluate. Extensions of the basic ABC rejection algorithm have improved the computational efficiency of the procedure and broadened its applicability. The ABC-Population Monte Carlo (ABC-PMC) approach has become a popular choice for approximate sampling from the posterior. ABC-PMC is a sequential sampler with an iteratively decreasing value of the tolerance, which specifies how close the simulated data need to be to the real data for acceptance. We propose a method for adaptively selecting a sequence of tolerances that improves the computational efficiency of the algorithm over other common techniques. In addition, we define a stopping rule as a by-product of the adaptation procedure, which assists in automating termination of sampling. The proposed automatic ABC-PMC algorithm can be easily implemented, and we present several examples demonstrating its benefits in terms of computational efficiency.
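
To make the role of the tolerance concrete, the following is a minimal Python sketch of an ABC-PMC sampler on a toy problem. It is not the adaptive tolerance selection or stopping rule proposed in the paper. It uses a fixed quantile of the previously accepted distances as the next tolerance, which is one of the common techniques the summary alludes to. The toy model (normal data with unknown mean), the Uniform(-10, 10) prior, the Gaussian perturbation kernel with twice the weighted particle variance, and all function and parameter names (`simulate`, `distance`, `abc_pmc`, `initial_tolerance`, `quantile`) are assumptions made for illustration only.

```python
import math
import random


def simulate(theta, n, rng):
    """Toy simulator (assumed): n draws from Normal(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def distance(x, y):
    """Absolute difference of sample means, used as the summary distance."""
    return abs(sum(x) / len(x) - sum(y) / len(y))


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def abc_pmc(observed, n_particles=200, n_iterations=4,
            initial_tolerance=5.0, quantile=0.5, seed=0):
    """ABC-PMC with a quantile-based tolerance schedule (illustrative only)."""
    rng = random.Random(seed)
    n = len(observed)
    prior_lo, prior_hi = -10.0, 10.0  # assumed Uniform(-10, 10) prior

    # Iteration 1: basic ABC rejection, proposing from the prior.
    tolerance = initial_tolerance
    particles, distances = [], []
    while len(particles) < n_particles:
        theta = rng.uniform(prior_lo, prior_hi)
        d = distance(simulate(theta, n, rng), observed)
        if d <= tolerance:
            particles.append(theta)
            distances.append(d)
    weights = [1.0 / n_particles] * n_particles
    tolerances = [tolerance]

    for _ in range(n_iterations - 1):
        # Next tolerance: a quantile of the distances accepted so far.
        tolerance = sorted(distances)[int(quantile * (n_particles - 1))]
        mean = sum(w * p for w, p in zip(weights, particles))
        var = sum(w * (p - mean) ** 2 for w, p in zip(weights, particles))
        kernel_sd = math.sqrt(2.0 * var)

        new_particles, new_weights, new_distances = [], [], []
        while len(new_particles) < n_particles:
            # Resample a previous particle and perturb it.
            centre = rng.choices(particles, weights=weights)[0]
            theta = rng.gauss(centre, kernel_sd)
            if not prior_lo <= theta <= prior_hi:
                continue  # zero prior density
            d = distance(simulate(theta, n, rng), observed)
            if d <= tolerance:
                # Importance weight: prior / mixture proposal density.
                # The uniform prior density is constant and cancels
                # when the weights are normalised below.
                denom = sum(w * normal_pdf(theta, p, kernel_sd)
                            for p, w in zip(particles, weights))
                new_particles.append(theta)
                new_weights.append(1.0 / denom)
                new_distances.append(d)

        total = sum(new_weights)
        weights = [w / total for w in new_weights]
        particles, distances = new_particles, new_distances
        tolerances.append(tolerance)

    return particles, weights, tolerances
```

Calling `abc_pmc(simulate(2.0, 50, random.Random(1)))` returns the final weighted particles together with the sequence of tolerances used. In this sketch the number of iterations is fixed in advance, whereas the paper's contribution is precisely to choose the tolerances adaptively and to stop automatically.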

MSC:

62L12 Sequential estimation
62F15 Bayesian inference
65C05 Monte Carlo methods

Software:

R; BRENT; astroABC; abc; cosmoabc
