Heteroscedastic BART via multiplicative regression trees. (English) Zbl 07499266

Summary: Bayesian additive regression trees (BART) has become increasingly popular as a flexible and scalable nonparametric regression approach for modern applied statistics problems. For the practitioner dealing with large and complex nonlinear response surfaces, its advantages include a matrix-free formulation and the lack of a requirement to prespecify a confining regression basis. Although flexible in fitting the mean, BART has been limited by its reliance on a constant variance error model. Alleviating this limitation, we propose HBART, a nonparametric heteroscedastic elaboration of BART. In BART, the mean function is modeled with a sum of trees, each of which determines an additive contribution to the mean. In HBART, the variance function is further modeled with a product of trees, each of which determines a multiplicative contribution to the variance. Like the mean model, this flexible, multidimensional variance model is entirely nonparametric with no need for the prespecification of a confining basis. Moreover, with this enhancement, HBART can provide insights into the potential relationships of the predictors with both the mean and the variance. Practical implementations of HBART with revealing new diagnostic plots are demonstrated with simulated and real data on used car prices and song year of release. Supplementary materials for this article are available online.
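
To fix ideas, the following is a minimal sketch of the model structure described above; the notation (number of mean trees \(m\), number of variance trees \(m'\), trees \(T\) with terminal-node parameters \(M\)) is assumed here, following standard BART conventions, rather than quoted from the paper:
\[
Y(x) = f(x) + s(x)\,Z, \qquad Z \sim N(0,1),
\]
\[
f(x) = \sum_{j=1}^{m} g(x; T_j, M_j), \qquad s^2(x) = \prod_{k=1}^{m'} h(x; T'_k, M'_k),
\]
where each \(g(x; T_j, M_j)\) is the step function induced by tree \(T_j\) and contributes additively to the mean, while each \(h(x; T'_k, M'_k) > 0\) contributes multiplicatively to the variance. The homoscedastic BART model is recovered as the special case in which \(s^2(x)\) is constant in \(x\).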

MSC:

62-XX Statistics

References:

[1] Albert, J. H.; Chib, S., “Bayesian Analysis of Binary and Polychotomous Response Data,”, Journal of the American Statistical Association, 88, 669-679 (1993) · Zbl 0774.62031 · doi:10.1080/01621459.1993.10476321
[2] Allen, G.; Grosenick, L.; Taylor, J., “A Generalized Least-Square Matrix Decomposition,”, Journal of the American Statistical Association, 109, 145-159 (2014) · Zbl 1367.62184 · doi:10.1080/01621459.2013.852978
[3] Bertin-Mahieux, T., “Million Song Dataset,” (2019)
[4] Bertin-Mahieux, T.; Ellis, D. P. W.; Whitman, B.; Lamere, P., “The Million Song Dataset,”, 12th International Conference on Music Information Retrieval (ISMIR), 2, 10 (2011)
[5] Bleich, J., and Kapelner, A. (2014), “Bayesian Additive Regression Trees With Parametric Models of Heteroskedasticity,” arXiv no. 1402.5397v1, pp. 1-20.
[6] Box, G. E.; Cox, D. R., “An Analysis of Transformations,”, Journal of the Royal Statistical Society, Series B, 26, 211-243 (1964) · Zbl 0156.40104 · doi:10.1111/j.2517-6161.1964.tb00553.x
[7] Breiman, L., “Random Forests,”, Machine Learning, 45, 5-32 (2001) · Zbl 1007.68152 · doi:10.1023/A:1010933404324
[8] Carroll, R. J.; Ruppert, D., Transformation and Weighting in Regression (1988), New York: Chapman and Hall · Zbl 0666.62062
[9] Chipman, H.; George, E.; McCulloch, R., “Bayesian CART Model Search,”, Journal of the American Statistical Association, 93, 935-960 (1998) · doi:10.1080/01621459.1998.10473750
[10] Chipman, H.; George, E.; McCulloch, R., “Bayesian Treed Models,”, Machine Learning, 48, 299-320 (2002) · Zbl 0998.68072 · doi:10.1023/A:1013916107446
[11] Chipman, H.; George, E.; McCulloch, R., “BART: Bayesian Additive Regression Trees,”, The Annals of Applied Statistics, 4, 266-298 (2010) · Zbl 1189.62066 · doi:10.1214/09-AOAS285
[12] Cook, R. D., “Fisher Lecture: Dimension Reduction in Regression,”, Statistical Science, 22, 1-26 (2007) · Zbl 1246.62148 · doi:10.1214/088342306000000682
[13] Daye, Z. J.; Chen, J.; Li, H., “High-Dimensional Heteroscedastic Regression With an Application to eQTL Data Analysis,”, Biometrics, 68, 316-326 (2012) · Zbl 1241.62152 · doi:10.1111/j.1541-0420.2011.01652.x
[14] Denison, D.; Mallick, B.; Smith, A., “A Bayesian CART Algorithm,”, Biometrika, 85, 363-377 (1998) · Zbl 1048.62502
[15] Friedman, J. H., “Greedy Function Approximation: A Gradient Boosting Machine,”, Annals of Statistics, 29, 1189-1232 (2001) · Zbl 1043.62034
[16] Freund, Y.; Schapire, R. E., “A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting,”, Journal of Computer and System Sciences, 55, 119-139 (1997) · Zbl 0880.68103 · doi:10.1006/jcss.1997.1504
[17] Gneiting, T.; Raftery, A. E., “Strictly Proper Scoring Rules, Prediction, and Estimation,”, Journal of the American Statistical Association, 102, 359-378 (2007) · Zbl 1284.62093 · doi:10.1198/016214506000001437
[18] Goldberg, P. W.; Williams, C. K. I.; Bishop, C. M., “Regression With Input-Dependent Noise: A Gaussian Process Treatment,”, Advances in Neural Information Processing Systems, 493-499 (1998)
[19] Gramacy, R.; Lee, H., “Bayesian Treed Gaussian Process Models With an Application to Computer Modeling,”, Journal of the American Statistical Association, 103, 1119-1130 (2008) · Zbl 1205.62218 · doi:10.1198/016214508000000689
[20] Koenker, R., Quantile Regression (2005), Cambridge: Cambridge University Press · Zbl 1111.62037
[21] Koenker, R.; Bassett, G. W., “Regression Quantiles,”, Econometrica, 46, 33-50 (1978) · Zbl 0373.62038 · doi:10.2307/1913643
[22] Langford, J., Li, L., and Strehl, A. L. (2007), “Vowpal Wabbit (Fast Online Learning).”
[23] McCulloch, R.; Pratola, M. T.; Chipman, H., “rbart: Bayesian Trees for Conditional Mean and Variance,” R package (2019)
[24] Murray, J. S. (2017), “Log-Linear Bayesian Additive Regression Trees for Categorical and Count Responses,” arXiv no. 1701.01503.
[25] Pratola, M. T., “Efficient Metropolis-Hastings Proposal Mechanisms for Bayesian Regression Tree Models,”, Bayesian Analysis, 11, 885-911 (2016) · Zbl 1357.62178 · doi:10.1214/16-BA999
[26] R Core Team, R: A Language and Environment for Statistical Computing (2019), Vienna, Austria: R Foundation for Statistical Computing
[27] Rizzo, M.; Székely, G., “energy: E-Statistics: Multivariate Inference via the Energy of Data,” R package (2018)
[28] Rockova, V., and van der Pas, S. (2017), “Posterior Concentration for Bayesian Regression Trees and Their Ensembles,” arXiv no. 1708.08734. · Zbl 1459.62057
[29] Sang, H.; Huang, J., “A Full Scale Approximation of Covariance Functions for Large Spatial Data Sets,”, Journal of the Royal Statistical Society, Series B, 74, 111-132 (2012) · Zbl 1411.62274 · doi:10.1111/j.1467-9868.2011.01007.x
[30] Székely, G. J.; Rizzo, M. L., “Testing for Equal Distributions in High Dimension,”, InterStat, 5, 1-6 (2004)
[31] Taddy, M.; Gramacy, R.; Polson, N., “Dynamic Trees for Learning and Design,”, Journal of the American Statistical Association, 106, 109-123 (2011) · Zbl 1396.62158 · doi:10.1198/jasa.2011.ap09769
[32] Tibshirani, R., “Regression Shrinkage and Selection via the Lasso,”, Journal of the Royal Statistical Society, Series B, 58, 267-288 (1996) · Zbl 0850.62538 · doi:10.1111/j.2517-6161.1996.tb02080.x
[33] Yeo, I. K.; Johnson, R. A., “A New Family of Power Transformations to Improve Normality or Symmetry,”, Biometrika, 87, 954-959 (2000) · Zbl 1028.62010 · doi:10.1093/biomet/87.4.954