Simulation-based Bayesian inference for epidemic models. (English) Zbl 1471.62137

Summary: A powerful and flexible method for fitting dynamic models to missing and censored data is to use the Bayesian paradigm via data-augmented Markov chain Monte Carlo (DA-MCMC). This samples from the joint posterior for the parameters and missing data, but requires high memory overheads for large-scale systems. In addition, designing efficient proposal distributions for the missing data is typically challenging. Pseudo-marginal methods instead integrate across the missing data using a Monte Carlo estimate for the likelihood, generated from multiple independent simulations from the model. These techniques can avoid the high memory requirements of DA-MCMC, and under certain conditions produce the exact marginal posterior distribution for parameters. A novel method is presented for implementing importance sampling for dynamic epidemic models, by conditioning the simulations on sets of validity criteria (based on the model structure) as well as the observed data. The flexibility of these techniques is illustrated using both removal time and final size data from an outbreak of smallpox. It is shown that these approaches can circumvent the need for reversible-jump MCMC, and can allow inference in situations where DA-MCMC is impossible due to computationally infeasible likelihoods.
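The pseudo-marginal idea described in the summary can be illustrated with a minimal sketch for final size data. This is not the authors' implementation: it uses naive forward simulation rather than the conditioned importance sampling the paper proposes, and it assumes a Markovian SIR model, a Uniform(0, 10) prior on the basic reproduction number, and a made-up outbreak (8 cases in a population of 30). All function names and parameter values are hypothetical.

```python
import random


def simulate_final_size(r0, population, rng):
    """Simulate the final size of a Markovian SIR epidemic via its jump chain.

    Assumes one initial infective; the final size counts that individual.
    """
    susceptible, infective = population - 1, 1
    while infective > 0 and susceptible > 0:
        # Probability that the next event is an infection rather than a removal.
        p_infection = r0 * susceptible / (r0 * susceptible + population)
        if rng.random() < p_infection:
            susceptible -= 1
            infective += 1
        else:
            infective -= 1
    return population - susceptible


def estimate_likelihood(r0, observed, population, n_sims, rng):
    """Unbiased Monte Carlo estimate of P(final size == observed | r0)."""
    hits = sum(
        simulate_final_size(r0, population, rng) == observed
        for _ in range(n_sims)
    )
    return hits / n_sims


def pseudo_marginal_mh(observed, population, n_iter, n_sims, step, rng,
                       prior_upper=10.0):
    """Pseudo-marginal Metropolis-Hastings for r0 under a Uniform(0, prior_upper) prior."""
    r0 = 1.5  # arbitrary starting value (assumption)
    like = estimate_likelihood(r0, observed, population, n_sims, rng)
    while like == 0.0:
        # Initialisation only: the chain needs a positive estimate to start from.
        like = estimate_likelihood(r0, observed, population, n_sims, rng)
    chain = []
    for _ in range(n_iter):
        proposal = r0 + rng.gauss(0.0, step)  # symmetric random-walk proposal
        if 0.0 < proposal < prior_upper:
            prop_like = estimate_likelihood(
                proposal, observed, population, n_sims, rng
            )
            # Flat prior and symmetric proposal: the acceptance ratio reduces
            # to the ratio of likelihood estimates. The stored estimate for
            # the current state is reused, never recomputed.
            if rng.random() * like < prop_like:
                r0, like = proposal, prop_like
        chain.append(r0)
    return chain


if __name__ == "__main__":
    rng = random.Random(42)
    samples = pseudo_marginal_mh(observed=8, population=30, n_iter=2000,
                                 n_sims=50, step=0.5, rng=rng)
    kept = samples[500:]  # discard an arbitrary burn-in
    print("posterior mean of r0 (toy data):", sum(kept) / len(kept))
```

Only the current parameter value and its likelihood estimate are stored between iterations, which is where the memory saving over DA-MCMC comes from. Reusing the stored estimate, rather than re-estimating it at each step, is what keeps the marginal posterior as the target distribution.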

MSC:

62-08 Computational methods for problems pertaining to statistics
62P10 Applications of statistics to biology and medical sciences; meta analysis
62F15 Bayesian inference
92D30 Epidemiology

Software:

RladyBug; BayesDA; CODA; R
