Scalable visualization methods for modern generalized additive models. (English) Zbl 07499273

Summary: In the last two decades, the growth of computational resources has made it possible to handle generalized additive models (GAMs) that formerly were too costly for serious applications. However, the growth in model complexity has not been matched by improved visualizations for model development and results presentation. Motivated by an industrial application in electricity load forecasting, we identify the areas where the lack of modern visualization tools for GAMs is particularly severe, and we address the shortcomings of existing methods by proposing a set of visual tools that (a) are fast enough for interactive use, (b) exploit the additive structure of GAMs, (c) scale to large data sets, and (d) can be used in conjunction with a wide range of response distributions. The new visual methods proposed here are implemented by the mgcViz R package, available on the Comprehensive R Archive Network. Supplementary materials for this article are available online.

MSC:

62-XX Statistics

References:

[1] Augustin, N. H.; Sauleau, E.-A.; Wood, S. N., “On Quantile Quantile Plots for Generalized Linear Models,” Computational Statistics & Data Analysis, 56, 2404-2409 (2012) · Zbl 1252.62072 · doi:10.1016/j.csda.2012.01.026
[2] Ben, M. G.; Yohai, V. J., “Quantile-quantile Plot for Deviance Residuals in the Generalized Linear Model,” Journal of Computational and Graphical Statistics, 13, 36-47 (2004) · doi:10.1198/1061860042949_a
[3] Bowman, D. W., “Graphics for Uncertainty,” Journal of the Royal Statistical Society, 182, 1-16 (2018)
[4] Buuren, S. v.; Fredriks, M., “Worm Plot: A Simple Diagnostic Device for Modelling Growth Reference Curves,” Statistics in Medicine, 20, 1259-1277 (2001) · doi:10.1002/sim.746
[5] Carr, D., Lewin-Koh, N., and Maechler, M. (2011), “hexbin: Hexagonal Binning Routines,” R package version 1.27.2.
[6] Carr, D. B.; Littlefield, R. J.; Nicholson, W.; Littlefield, J., “Scatterplot Matrix Techniques for Large n,” Journal of the American Statistical Association, 82, 424-436 (1987) · doi:10.2307/2289444
[7] Chang, W., Cheng, J., Allaire, J., Xie, Y., and McPherson, J. (2018), “shiny: Web Application Framework for R,” R package version 1.1.0.
[8] Cox, D. R.; Snell, E. J., “A General Definition of Residuals,” Journal of the Royal Statistical Society, Series B, 30, 248-275 (1968) · Zbl 0164.48903 · doi:10.1111/j.2517-6161.1968.tb00724.x
[9] Czado, C.; Gneiting, T.; Held, L., “Predictive Model Assessment for Count Data,” Biometrics, 65, 1254-1261 (2009) · Zbl 1180.62162 · doi:10.1111/j.1541-0420.2009.01191.x
[10] Dunn, P. K.; Smyth, G. K., “Randomized Quantile Residuals,” Journal of Computational and Graphical Statistics, 5, 236-244 (1996) · doi:10.2307/1390802
[11] Fasiolo, M.; Goude, Y.; Nedellec, R.; Wood, S. N., Fast Calibrated Additive Quantile Regression, arXiv preprint arXiv:1707.03307 (2017)
[12] Jones, M.; Pewsey, A., “Sinh-arcsinh Distributions,” Biometrika, 96, 761-780 (2009) · Zbl 1183.62019 · doi:10.1093/biomet/asp053
[13] McLean, M. W.; Hooker, G.; Staicu, A.-M.; Scheipl, F.; Ruppert, D., “Functional Generalized Additive Models,” Journal of Computational and Graphical Statistics, 23, 249-269 (2014) · doi:10.1080/10618600.2012.729985
[14] Michael, J. R., “The Stabilized Probability Plot,” Biometrika, 70, 11-17 (1983) · doi:10.1093/biomet/70.1.11
[15] Murdoch, D., rgl: An R Interface to OpenGL, 2 (2001)
[16] Pierce, D. A.; Schafer, D. W., “Residuals in Generalized Linear Models,” Journal of the American Statistical Association, 81, 977-986 (1986) · Zbl 0644.62076 · doi:10.1080/01621459.1986.10478361
[17] Rigby, R. A.; Stasinopoulos, D. M., “Generalized Additive Models for Location, Scale and Shape,” Journal of the Royal Statistical Society, Series C, 54, 507-554 (2005) · Zbl 1490.62201 · doi:10.1111/j.1467-9876.2005.00510.x
[18] Sievert, C., Parmer, C., Hocking, T., Chamberlain, S., Ram, K., Corvellec, M., and Despouy, P. (2017), “plotly: Create Interactive Web Graphics Via ’Plotly.js’,” R package version 4.9.0.
[19] Wand, M., and Ripley, B. (2006), “KernSmooth: Functions for Kernel Smoothing for Wand & Jones (1995),” R package version 2.23.
[20] Wand, M. P., “Fast Computation of Multivariate Kernel Estimators,” Journal of Computational and Graphical Statistics, 3, 433-445 (1994) · doi:10.2307/1390904
[21] Wand, M. P., “Fast Approximate Inference for Arbitrarily Large Semiparametric Regression Models Via Message Passing,” Journal of the American Statistical Association, 112, 137-168 (2017) · doi:10.1080/01621459.2016.1197833
[22] Wickham, H., ggplot2: Elegant Graphics for Data Analysis (2009), New York: Springer-Verlag · Zbl 1170.62004
[23] Wickham, H., “A Layered Grammar of Graphics,” Journal of Computational and Graphical Statistics, 19, 3-28 (2010) · doi:10.1198/jcgs.2009.07098
[24] Wickham, H. (2013), “Bin-summarise-smooth: A Framework for Visualising Large Data,” had.co.nz, Tech. Rep.
[25] Wickham, H.; Hofmann, H.; Wickham, C.; Cook, D., “Glyph-maps for Visually Exploring Temporal Patterns in Climate Data and Models,” Environmetrics, 23, 382-393 (2012) · doi:10.1002/env.2152
[26] Woo, M.; Neider, J.; Davis, T.; Shreiner, D., OpenGL Programming Guide: The Official Guide to Learning OpenGL, version 1.2 (1999), Boston: Addison-Wesley Longman Publishing Co., Inc.
[27] Wood, S. N., Generalized Additive Models: An Introduction with R (2017), Boca Raton, FL: CRC Press · Zbl 1368.62004
[28] Wood, S. N.; Goude, Y.; Shaw, S., “Generalized Additive Models for Large Data Sets,” Journal of the Royal Statistical Society, Series C, 64, 139-155 (2015) · doi:10.1111/rssc.12068
[29] Wood, S. N.; Pya, N.; Säfken, B., “Smoothing Parameter and Model Selection for General Smooth Models,” Journal of the American Statistical Association, 111, 1548-1575 (2016) · doi:10.1080/01621459.2016.1180986
[30] Wood, S. N.; Li, Z.; Shaddick, G.; Augustin, N. H., “Generalized Additive Models for Gigadata: Modeling the UK Black Smoke Network Daily Data,” Journal of the American Statistical Association, 112, 1199-1210 (2017) · doi:10.1080/01621459.2016.1195744
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.