
A robust mixed-effects parametric quantile regression model for continuous proportions: quantifying the constraints to vitality in cushion plants. (English) Zbl 07778729

Summary: There is no literature on outlier-robust parametric mixed-effects quantile regression models for continuous proportion data as an alternative to systematically identifying and eliminating outliers. To fill this gap, we formulate a robust method by extending the recently proposed fixed-effects quantile regression model based on the heavy-tailed Johnson-\(t\) distribution for continuous proportion data to the mixed-effects modeling context, using a Bayesian approach. Our proposed method is motivated by and used to model the extreme quantiles of the vitality of cushion plants to provide insights into the ecology of the system in which the plants are dominant. We conducted a simulation study to assess the new method’s performance and robustness to outliers. We show that the new model has good accuracy and confidence interval coverage properties and is remarkably robust to outliers. In contrast, our study demonstrates that the current approach in the literature for modeling hierarchically structured bounded data’s quantiles is susceptible to outliers, especially when modeling the extreme quantiles. We conclude that the proposed model is an appropriate robust alternative to the current approach for modeling the quantiles of correlated continuous proportions when outliers are present in the data.
© 2023 The Authors. Statistica Neerlandica published by John Wiley & Sons Ltd on behalf of Netherlands Society for Statistics and Operations Research.

MSC:

62Fxx Parametric inference
62Jxx Linear inference, regression
62Exx Statistical distribution theory

References:

[1] Bayes, C. L., Bazán, J. L., & de Castro, M. (2017). A quantile parametric mixed regression model for bounded response variables. Statistics and Its Interface, 10(3), 483-493. · Zbl 1388.62200
[2] Bayes, C. L., Bazán, J. L., & García, C. (2012). A new robust regression model for proportions. Bayesian Analysis, 7(4), 841-866. · Zbl 1330.62272
[3] Begashaw, G. B., & Yohannes, Y. B. (2020). Review of outlier detection and identifying using robust regression model. International Journal of Systems Science and Applied Mathematics, 5(1), 4.
[4] Benhadi‐Marín, J. (2018). A conceptual framework to deal with outliers in ecology. Biodiversity and Conservation, 27(12), 3295-3300.
[5] Brooks, M. E., Kristensen, K., van Benthem, K. J., Magnusson, A., Berg, C. W., Nielsen, A., … Bolker, B. M. (2017). glmmTMB balances speed and flexibility among packages for zero‐inflated generalized linear mixed modeling. The R Journal, 9(2), 378-400.
[6] Brooks, S. P., & Gelman, A. (1998). General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics, 7(4), 434-455. https://doi.org/10.1080/10618600.1998.10474787 · doi:10.1080/10618600.1998.10474787
[7] Burger, D. A., & Lesaffre, E. (2021). Nonlinear mixed‐effects modeling of longitudinal count data: Bayesian inference about median counts based on the marginal zero‐inflated discrete Weibull distribution. Statistics in Medicine, 40(23), 5078-5095.
[8] Burger, D. A., Schall, R., Ferreira, J. T., & Chen, D.‐G. (2020). A robust Bayesian mixed effects approach for zero inflated and highly skewed longitudinal count data emanating from the zero inflated discrete Weibull distribution. Statistics in Medicine, 39(9), 1275-1291. https://doi.org/10.1002/sim.8475 · doi:10.1002/sim.8475
[9] Cade, B. S., & Noon, B. R. (2003). A gentle introduction to quantile regression for ecologists. Frontiers in Ecology and the Environment, 1(8), 412-420.
[10] Cancho, V. G., Bazán, J. L., & Dey, D. K. (2020). A new class of regression model for a bounded response with application in the study of the incidence rate of colorectal cancer. Statistical Methods in Medical Research, 29(7), 2015-2033.
[11] Carpenter, B., Gelman, A., Hoffman, M. D., Lee, D., Goodrich, B., Betancourt, M., … Riddell, A. (2017). Stan: A probabilistic programming language. Journal of Statistical Software, 76(1), 1-32.
[12] Denwood, M. J. (2016). Runjags: An R package providing interface utilities, model templates, parallel computing methods and additional distributions for MCMC models in JAGS. Journal of Statistical Software, 71(9), 1-25.
[13] di Brisco, A. M., & Migliorati, S. (2020). A new mixed‐effects mixture model for constrained longitudinal data. Statistics in Medicine, 39(2), 129-145.
[14] Dunn, P. K., & Smyth, G. K. (1996). Randomized quantile residuals. Journal of Computational and Graphical Statistics, 5(3), 236-244.
[15] Ferrari, S., & Cribari‐Neto, F. (2004). Beta regression for modelling rates and proportions. Journal of Applied Statistics, 31(7), 799-815. · Zbl 1121.62367
[16] Flores, S. E., Prates, M. O., Bazán, J. L., & Bolfarine, H. B. (2021). Spatial regression models for bounded response variables with evaluation of the degree of dependence. Statistics and Its Interface, 14(2), 95-107. · Zbl 07342183
[17] Fonseca, T. C. O., Ferreira, M. A. R., & Migon, H. S. (2008). Objective Bayesian analysis for the Student‐\(t\) regression model. Biometrika, 95(2), 325-333. · Zbl 1400.62260
[18] Galarza, C. E., Zhang, P., & Lachos, V. H. (2020). Logistic quantile regression for bounded outcomes using a family of heavy‐tailed distributions. Sankhya B, 83, 325-349. · Zbl 1493.62522
[19] Gelfand, A. E., & Smith, A. F. M. (1990). Sampling‐based approaches to calculating marginal densities. Journal of the American Statistical Association, 85, 398-409. https://doi.org/10.1080/01621459.1990.10476213 · Zbl 0702.62020 · doi:10.1080/01621459.1990.10476213
[20] Gelman, A., & Hill, J. (Eds.). (2006). Data analysis using regression and multilevel/hierarchical models. Cambridge, UK: Cambridge University Press.
[21] Goddard, K. A., Craig, K. J., Schoombie, J., & le Roux, P. C. (2022). Investigation of ecologically relevant wind patterns on Marion Island using computational fluid dynamics and measured data. Ecological Modelling, 464, 109827.
[22] Hartig, F. (2021a). DHARMa: Residual diagnostics for hierarchical (multi‐level/mixed) regression models. Retrieved from https://cran.r-project.org/web/packages/DHARMa/vignettes/DHARMa.html
[23] Hartig, F. (2021b). DHARMa: Residual diagnostics for hierarchical (multi‐level/mixed) regression models. R Package Version 0.4.4.
[24] Huang, A., & Wand, M. P. (2013). Simple marginally noninformative prior distributions for covariance matrices. Bayesian Analysis, 8(2), 439-452. https://doi.org/10.1214/13-BA815 · Zbl 1329.62135 · doi:10.1214/13-BA815
[25] Huntley, B. J. (1972). Notes on the ecology of Azorella selago Hook. f. Journal of South African Botany, 38, 103-113.
[26] Johnson, N. L. (1949). Systems of frequency curves generated by methods of translation. Biometrika, 36, 149-176. · Zbl 0033.07204
[27] Juárez, M. A., & Steel, M. F. J. (2010). Model‐based clustering of non‐Gaussian panel data based on skew‐t distributions. Journal of Business & Economic Statistics, 28(1), 52-66. · Zbl 1198.62097
[28] Kellner, K. (2021). jagsUI: A wrapper around rjags to streamline JAGS analyses. R package version 1.5.2.
[29] Koenker, R., & Bassett, G., Jr. (1978). Regression quantiles. Econometrica, 46(1), 33-50. · Zbl 0373.62038
[30] Koenker, R., Portnoy, S., Ng, P. T., Melly, B., Zeileis, A., Grosjean, P., … Ripley, B. D. (2021). Quantreg: Quantile regression. R package version 5.86.
[31] Kwak, S. K., & Kim, J. H. (2017). Statistical data preparation: Management of missing values and outliers. Korean Journal of Anesthesiology, 70(4), 407.
[32] Lange, K. L., Little, R. J. A., & Taylor, J. M. G. (1989). Robust statistical modeling using the \(t\) distribution. Journal of the American Statistical Association, 84(408), 881-896.
[33] le Roux, P. C. (2008). Climate and climate change. In S. L. Chown & P. W. Froneman (Eds.), The Prince Edward Islands: Land‐Sea interactions in a changing ecosystem (pp. 39-64). Stellenbosch, South Africa: African SunMedia.
[34] le Roux, P. C., McGeoch, M. A., Nyakatya, M. J., & Chown, S. L. (2005). Effects of a short‐term climate change experiment on a sub‐Antarctic keystone plant species. Global Change Biology, 11(10), 1628-1639.
[35] le Roux, P. C., Shaw, J. D., & Chown, S. L. (2013). Ontogenetic shifts in plant interactions vary with environmental severity and affect population structure. New Phytologist, 200(1), 241-250.
[36] Lemonte, A. J., & Moreno‐Arenas, G. (2020). On a heavy‐tailed parametric quantile regression model for limited range response variables. Computational Statistics, 35(1), 379-398. · Zbl 1505.62243
[37] Leys, C., Klein, O., Dominicy, Y., & Ley, C. (2018). Detecting multivariate outliers: Use a robust variant of the Mahalanobis distance. Journal of Experimental Social Psychology, 74, 150-156.
[38] Lindgren, F., & Rue, H. (2015). Bayesian spatial modelling with R‐INLA. Journal of Statistical Software, 63(1), 1-25.
[39] Mazucheli, J., Leiva, V., Alves, B., & Menezes, A. F. B. (2021). A new quantile regression for modeling bounded data under a unit Birnbaum-Saunders distribution with applications in medicine and politics. Symmetry, 13(4), 682.
[40] Mazucheli, J., Menezes, A. F. B., Fernandes, L. B., de Oliveira, R. P., & Ghitany, M. E. (2020). The unit‐Weibull distribution as an alternative to the Kumaraswamy distribution for the modeling of quantiles conditional on covariates. Journal of Applied Statistics, 47(6), 954-974. · Zbl 1521.62406
[41] McGeoch, M. A., le Roux, P. C., Hugo, E. A., & Nyakatya, M. J. (2008). Spatial variation in the terrestrial biotic system. In S. L. Chown & P. W. Froneman (Eds.), The Prince Edward Islands: Land‐Sea interactions in a changing ecosystem (pp. 245-276). Stellenbosch, South Africa: African SunMedia.
[42] Migliorati, S., di Brisco, A. M., & Ongaro, A. (2018). A new regression model for bounded responses. Bayesian Analysis, 13(3), 845-872. · Zbl 1407.62279
[43] Min, I., & Kim, I. (2004). A Monte Carlo comparison of parametric and nonparametric quantile regressions. Applied Economics Letters, 11(2), 71-74.
[44] Mitnik, P. A., & Baek, S. (2013). The Kumaraswamy distribution: Median‐dispersion re‐parameterizations for regression modeling and simulation‐based estimation. Statistical Papers, 54(1), 177-192. · Zbl 1257.62013
[45] Nyakatya, M. J., & McGeoch, M. A. (2008). Temperature variation across Marion Island associated with a keystone plant species (Azorella selago Hook.(Apiaceae)). Polar Biology, 31(2), 139-151.
[46] Owen, W. R. (1995). Growth and reproduction in an alpine cushion plant: Astragalus kentrophyta var. implexus. The Great Basin Naturalist, 55, 117-123.
[47] Plummer, M. (2003). JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling. In Proceedings of the 3rd International Workshop on Distributed Statistical Computing, Vienna, Austria, 1-10.
[48] Quintero, A., & Lesaffre, E. (2018). Comparing hierarchical models via the marginalized deviance information criterion. Statistics in Medicine, 37(16), 2440-2454. https://doi.org/10.1002/sim.7649 · doi:10.1002/sim.7649
[49] Raath‐Krüger, M. J., Schöb, C., McGeoch, M. A., Burger, D. A., Strydom, T., & le Roux, P. C. (2022). Long‐term spatially‐replicated data show no cost to a benefactor species in a facilitative plant‐plant interaction. bioRxiv. Retrieved from.
[50] Raath‐Krüger, M. J., Schöb, C., McGeoch, M. A., & le Roux, P. C. (2021). Interspecific facilitation mediates the outcome of intraspecific interactions across an elevational gradient. Ecology, 102(1), e03200.
[51] Simpson, D., Rue, H., Riebler, A., Martins, T. G., & Sørbye, S. H. (2017). Penalising model component complexity: A principled, practical approach to constructing priors. Statistical Science, 32(1), 1-28. · Zbl 1442.62060
[52] Spiegelhalter, D. J., Best, N. G., Carlin, B. P., & van der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B, 64(4), 583-639. https://doi.org/10.1111/1467-9868.00353 · doi:10.1111/1467-9868.00353
[53] Tomazella, V. L. D., Jesus, S. R., Gazon, A. B., Louzada, F., Nadarajah, S., Nascimento, D. C., … Ramos, P. L. (2021). Bayesian reference analysis for the generalized normal linear regression model. Symmetry, 13(5), 856.
[54] Wang, J., & Luo, S. (2016). Augmented Beta rectangular regression models: A Bayesian perspective. Biometrical Journal, 58(1), 206-221. · Zbl 1403.62215
[55] Wei, Y., Kehm, R. D., Goldberg, M., & Terry, M. B. (2019). Applications for quantile regression in epidemiology. Current Epidemiology Reports, 6(2), 191-199.
[56] Yirga, A. A., Melesse, S. F., Mwambi, H. G., & Ayele, D. G. (2021). Additive quantile mixed effects modelling with application to longitudinal CD4 count data. Scientific Reports, 11(1), 1-12.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.