
Heavy-tailed density estimation. (English) Zbl 07820373

Summary: A novel statistical method is proposed and investigated for estimating a heavy-tailed density under mild smoothness assumptions. Statistical analyses of heavy-tailed distributions are susceptible to the problem of sparse information in the tail of the distribution getting washed away by unrelated features of a hefty bulk. The proposed Bayesian method avoids this problem by incorporating smoothness and tail regularization through a carefully specified semiparametric prior distribution, and is able to consistently estimate both the density function and its tail index at near minimax-optimal rates of contraction. A joint, likelihood-driven estimation of the bulk and the tail is shown to improve uncertainty assessment in estimating the tail index parameter and to offer more accurate and reliable estimates of the high tail quantiles than thresholding methods. Supplementary materials for this article are available online.

MSC:

62-XX Statistics
