Bayesian nonparametric quantile process regression and estimation of marginal quantile effects. (English) Zbl 1522.62267

Summary: Flexible estimation of multiple conditional quantiles is of interest in numerous applications, such as studying the effect of pregnancy-related factors on low and high birth weight. We propose a Bayesian nonparametric method to simultaneously estimate noncrossing, nonlinear quantile curves. We expand the conditional distribution function of the response in I-spline basis functions where the covariate-dependent coefficients are modeled using neural networks. By leveraging the approximation power of splines and neural networks, our model can approximate any continuous quantile function. Compared to existing models, our model estimates all rather than a finite subset of quantiles, scales well to high dimensions, and accounts for estimation uncertainty. While the model is arbitrarily flexible, interpretable marginal quantile effects are estimated using accumulative local effect plots and variable importance measures. A simulation study shows that our model can better recover quantiles of the response distribution when the data are sparse, and an analysis of birth weight data is presented.
{© 2021 The International Biometric Society.}
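
The construction described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the summary gives neither the spline degree, the network architecture, nor the priors, so everything below is an assumption made for illustration. The sketch uses piecewise-linear ramps as a lowest-order monotone (I-spline-type) basis, a one-hidden-layer network with a softmax output to produce covariate-dependent weights on the simplex, and random rather than fitted parameters. Because the weights are nonnegative and sum to one, the resulting conditional distribution function is nondecreasing in the response, so the quantile curves obtained by inverting it cannot cross.

```python
import numpy as np

def ramp_basis(y, knots):
    """Monotone ramp basis (assumed stand-in for I-splines):
    column j rises linearly from 0 to 1 on [knots[j], knots[j+1]]."""
    y = np.asarray(y, dtype=float)[:, None]
    lo, hi = knots[:-1][None, :], knots[1:][None, :]
    return np.clip((y - lo) / (hi - lo), 0.0, 1.0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def network_weights(x, params):
    """Hypothetical one-hidden-layer network: covariates -> simplex weights."""
    W1, b1, W2, b2 = params
    hidden = np.tanh(x @ W1 + b1)
    return softmax(hidden @ W2 + b2)

def conditional_cdf(y, x, knots, params):
    """F(y | x) = sum_j w_j(x) * I_j(y); rows index x, columns index y."""
    basis = ramp_basis(y, knots)          # shape (n_y, J)
    weights = network_weights(x, params)  # shape (n_x, J)
    return weights @ basis.T              # shape (n_x, n_y)

def conditional_quantile(taus, x, knots, params, grid_size=2001):
    """Invert F(. | x) on a grid: smallest grid value with F >= tau."""
    grid = np.linspace(knots[0], knots[-1], grid_size)
    F = conditional_cdf(grid, x, knots, params)
    F = np.maximum.accumulate(F, axis=1)  # guard against rounding noise
    idx = np.stack([np.searchsorted(row, taus, side="left") for row in F])
    return grid[np.minimum(idx, grid_size - 1)]

# Illustrative usage with random (not fitted) parameters.
rng = np.random.default_rng(0)
p, H, J = 3, 8, 6                         # covariates, hidden units, basis size
knots = np.linspace(0.0, 1.0, J + 1)      # response assumed rescaled to [0, 1]
params = (rng.normal(size=(p, H)), rng.normal(size=H),
          rng.normal(size=(H, J)), rng.normal(size=J))
x = rng.normal(size=(5, p))
taus = np.array([0.1, 0.5, 0.9])
q = conditional_quantile(taus, x, knots, params)  # shape (5, 3)
```

In a Bayesian treatment such as the one the summary describes, the network parameters would carry priors and be sampled rather than drawn once at random; the sketch only shows why the basis expansion yields a valid, noncrossing family of quantiles for every covariate value.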

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis

Software:

BSquare

References:

[1] Abrahamowicz, M., Ciampi, A. and Ramsay, J. O. (1992) Nonparametric density estimation for censored survival data: regression‐spline approach. Canadian Journal of Statistics, 20, 171-185. · Zbl 0754.62017
[2] Abrevaya, J. (2001) The effects of demographics and maternal behavior on the distribution of birth outcomes. Empirical Economics, 26, 247-257.
[3] Apley, D. W. and Zhu, J. (2020) Visualizing the effects of predictor variables in black box supervised learning models. Journal of the Royal Statistical Society. Series B, Statistical Methodology, 82, 1059-1086. · Zbl 07554784
[4] Beatson, R. (1982) Restricted range approximation by splines and variational inequalities. SIAM Journal on Numerical Analysis, 19, 372-380. · Zbl 0491.41011
[5] Bishop, C. M. (1995) Neural Networks for Pattern Recognition. Oxford, UK: Oxford University Press.
[6] Bondell, H. D., Reich, B. J. and Wang, H. (2010) Noncrossing quantile regression curve estimation. Biometrika, 97, 825-838. · Zbl 1204.62061
[7] Cannon, A. J. (2018) Non‐crossing nonlinear regression quantiles by monotone composite quantile regression neural network, with application to rainfall extremes. Stochastic Environmental Research and Risk Assessment, 32, 3207-3225.
[8] Chui, C., Smith, P. and Ward, J. (1980) Degree of \(L_p\) approximation by monotone splines. SIAM Journal on Mathematical Analysis, 11, 436-447. · Zbl 0456.41013
[9] Das, P. and Ghosal, S. (2018) Bayesian non‐parametric simultaneous quantile regression for complete and grid data. Computational Statistics and Data Analysis, 127, 172-186. · Zbl 1469.62051
[10] Gelman, A. (2006) Prior distributions for variance parameters in hierarchical models. Bayesian Analysis, 1, 515-534. · Zbl 1331.62139
[11] Greenwell, B. M., Boehmke, B. C. and McCarthy, A. J. (2018) A simple and effective model‐based variable importance measure. arXiv preprint arXiv:1805.04755.
[12] He, X. (1997) Quantile curves without crossing. American Statistician, 51, 186-192.
[13] Hoffman, M. D. and Gelman, A. (2014) The No‐U‐Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15, 1593-1623. · Zbl 1319.60150
[14] Holmes, M. P., Gray, A. G. and Isbell, C. L. (2012) Fast nonparametric conditional density estimation. arXiv preprint arXiv:1206.5278.
[15] Hornik, K., Stinchcombe, M. and White, H. (1989) Multilayer feedforward networks are universal approximators. Neural Networks, 2, 359-366. · Zbl 1383.92015
[16] Izbicki, R. and Lee, A. B. (2016) Nonparametric conditional density estimation in a high‐dimensional regression setting. Journal of Computational and Graphical Statistics, 25, 1297-1316.
[17] Kim, T., Fakoor, R., Mueller, J., Smola, A. J. and Tibshirani, R. J. (2021) Deep quantile aggregation. arXiv preprint arXiv:2103.00083.
[18] Li, R., Reich, B. J. and Bondell, H. D. (2021) Deep distribution regression. Computational Statistics & Data Analysis, 159, 107203. · Zbl 1510.62059
[19] Liu, Y. and Wu, Y. (2011) Simultaneous multiple non‐crossing quantile regression estimation using kernel constraints. Journal of Nonparametric Statistics, 23, 415-437. · Zbl 1359.62108
[20] MacKay, D. J. (1992) A practical Bayesian framework for backpropagation networks. Neural Computation, 4, 448-472.
[21] National Center for Health Statistics (2019) 2019 Natality. Data retrieved from Centers for Disease Control and Prevention. Available at: https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Datasets/DVS/natality/. Accessed September 27, 2021.
[22] Neal, R. M. (1993) Bayesian learning via stochastic dynamics. Advances in Neural Information Processing Systems, 5, 475-482.
[23] Ngwira, A. and Stanley, C. C. (2015) Determinants of low birth weight in Malawi: Bayesian geo‐additive modelling. PloS One, 10, e0130057.
[24] Reich, B. J. and Smith, L. B. (2013) Bayesian quantile regression for censored data. Biometrics, 69, 651-660. · Zbl 1418.62170
[25] Ribeiro, M. T., Singh, S. and Guestrin, C. (2016) Model‐agnostic interpretability of machine learning. arXiv preprint arXiv:1606.05386.
[26] Smith, L. B., Reich, B. J., Herring, A. H., Langlois, P. H. and Fuentes, M. (2015) Multilevel quantile function modeling with application to birth outcomes. Biometrics, 71, 508-519. · Zbl 1390.62311
[27] Tokdar, S. T. and Kadane, J. B. (2012) Simultaneous linear quantile regression: a semiparametric Bayesian approach. Bayesian Analysis, 7, 51-72. · Zbl 1330.62193
[28] Vehtari, A., Gelman, A. and Gabry, J. (2017) Practical Bayesian model evaluation using leave‐one‐out cross‐validation and WAIC. Statistics and Computing, 27, 1413-1432. · Zbl 1505.62408
[29] Watanabe, S. (2013) A widely applicable Bayesian information criterion. Journal of Machine Learning Research, 14, 867-897. · Zbl 1320.62058
[30] Xu, S. G. (2021) 2019 U.S. Birth Weight. Open Science Framework. Available at: https://doi.org/10.17605/OSF.IO/3GFHE. Accessed September 27, 2021. · doi:10.17605/OSF.IO/3GFHE
[31] Yang, Y. and Tokdar, S. T. (2017) Joint estimation of quantile planes over arbitrary predictor spaces. Journal of the American Statistical Association, 112, 1107-1120.
[32] Yuan, Y., Chen, N. and Zhou, S. (2017) Modeling regression quantile process using monotone B‐splines. Technometrics, 59, 338-350.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.