
Phenomenological forecasting of disease incidence using heteroskedastic Gaussian processes: a dengue case study. (English) Zbl 1393.62071

Summary: In 2015 the US federal government sponsored a dengue forecasting competition using historical case data from Iquitos, Peru, and San Juan, Puerto Rico. Competitors were evaluated on several aspects of out-of-sample forecasts, including the targets of peak week, peak incidence during that week, and total season incidence, across each of several seasons. Our team was one of the winners of that competition, outperforming other teams in multiple targets/locales. In this paper we report on our methodology, a large component of which, surprisingly, ignores the known biology of epidemics at large (for example, relationships between dengue transmission and environmental factors) and instead relies on flexible nonparametric nonlinear Gaussian process (GP) regression fits that “memorize” the trajectories of past seasons and then “match” the dynamics of the unfolding season to past ones in real time. Our phenomenological approach has advantages in situations where disease dynamics are less well understood, where measurements and forecasts of ancillary covariates like precipitation are unavailable, and/or where the strength of their association with cases is as yet unknown. In particular, we show that the GP approach generally outperforms a more classical generalized linear (autoregressive) model (GLM) that we developed to utilize abundant covariate information. We illustrate variations of our method(s) on the two benchmark locales alongside a full summary of results submitted by other contest competitors.
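
To make the idea of a heteroskedastic GP fit concrete, the following is a minimal sketch in Python and not the authors' implementation (the software listed below is R-based). It assumes a squared-exponential kernel, synthetic seasonal data, and per-observation noise variances supplied by the caller; a full heteroskedastic GP would typically estimate those variances from the data. The names `sq_exp_kernel` and `gp_predict` are hypothetical helpers introduced for illustration.

```python
import numpy as np


def sq_exp_kernel(xa, xb, lengthscale=1.0, scale=1.0):
    """Squared-exponential covariance between two 1-D input arrays."""
    d = xa[:, None] - xb[None, :]
    return scale * np.exp(-0.5 * (d / lengthscale) ** 2)


def gp_predict(x_train, y_train, noise_var, x_new, lengthscale=1.0, scale=1.0):
    """Posterior mean and variance of the latent function at x_new.

    noise_var holds one noise variance per training point (heteroskedastic).
    Assumption: these variances are known; they are not estimated here.
    """
    K = sq_exp_kernel(x_train, x_train, lengthscale, scale) + np.diag(noise_var)
    Ks = sq_exp_kernel(x_new, x_train, lengthscale, scale)
    Kss = sq_exp_kernel(x_new, x_new, lengthscale, scale)
    L = np.linalg.cholesky(K)  # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = np.diag(Kss) - np.sum(v ** 2, axis=0)
    return mean, var


# Synthetic stand-in for three past seasons of (transformed) weekly incidence.
rng = np.random.default_rng(0)
weeks = np.tile(np.arange(1.0, 53.0), 3)
signal = np.exp(-0.5 * ((weeks - 26.0) / 6.0) ** 2)
noise_var = 0.01 + 0.05 * signal  # assumed: observations are noisier near the peak
y = signal + rng.normal(0.0, np.sqrt(noise_var))

# Predict the seasonal curve on a weekly grid and read off a peak-week estimate.
grid = np.arange(1.0, 53.0)
mean, var = gp_predict(weeks, y, noise_var, grid, lengthscale=5.0)
peak_week = int(grid[np.argmax(mean)])
```

Pooling past seasons on a common week-of-season axis is one simple way to let the fit "memorize" a typical trajectory; matching an unfolding season to specific past ones, as the summary describes, would require additional inputs that this sketch omits.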

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
62M20 Inference from stochastic processes and prediction
62M30 Inference from spatial processes
92C60 Medical epidemiology

Software:

GitHub; laGP; vbdcast; R; S-PLUS
