Approximate Bayesian computation and simulation-based inference for complex stochastic epidemic models. (English) Zbl 1407.62406

Summary: Approximate Bayesian Computation (ABC) and other simulation-based inference methods are increasingly used for inference in complex systems, owing to their relative ease of implementation. We briefly review some of the more popular variants of ABC and their application in epidemiology, before using a real-world model of HIV transmission to illustrate some of the challenges that arise when applying ABC methods to high-dimensional, computationally intensive models. We then discuss an alternative approach, history matching, that aims to address some of these issues, and conclude with a comparison between these different methodologies.

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
62F15 Bayesian inference
92C60 Medical epidemiology

Software:

epiABC; abc; GPS-ABC
