A practical, effective calculation of gamma difference distributions with open data science tools. (English) Zbl 07551351

Summary: At present, there is still no officially accepted and extensively verified implementation for computing the gamma difference distribution with unequal shape parameters. We explore four ways of computing the gamma difference distribution with different shape parameters, which arises in time series kriging, a forecasting approach based on best linear unbiased prediction and linear mixed models. The results of our numerical study, with emphasis on open data science tools, demonstrate that our computational algorithm, implemented in high-performance Python (with Numba), is exponentially fast, highly accurate and very reliable. It combines numerical inversion of the characteristic function and the trapezoidal rule with the double exponential oscillatory transformation (DE quadrature). At double (53-bit) precision, our open tool was 1.5–2 orders of magnitude faster than the analytical computation based on Tricomi's \(U(a, b, z)\) function in CAS software (commercial Mathematica, open SageMath). At the default precision of scientific numerical computational tools, it was 5–10 times faster than open SciPy and NumPy and commercial MATLAB. A potential future application of our tool to a mixture of characteristic functions could open new possibilities for fast data analysis based on exact probability distributions in areas such as multidimensional statistics, measurement uncertainty analysis in metrology, financial mathematics, and risk analysis.
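
The inversion approach named in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a plain truncated trapezoidal rule in NumPy rather than the double exponential oscillatory transformation, and it does not use Numba. The shape/rate parametrization, the function names `gdd_cf` and `gdd_pdf`, and the defaults `t_max` and `n` are assumptions made for illustration only.

```python
import numpy as np


def gdd_cf(t, a1, b1, a2, b2):
    """Characteristic function of X1 - X2 for independent
    X1 ~ Gamma(shape a1, rate b1) and X2 ~ Gamma(shape a2, rate b2).
    (Shape/rate parametrization is an assumption of this sketch.)"""
    return (1 - 1j * t / b1) ** (-a1) * (1 + 1j * t / b2) ** (-a2)


def gdd_pdf(x, a1, b1, a2, b2, t_max=2000.0, n=200_001):
    """Density of X1 - X2 via the inversion formula
    f(x) = (1/pi) * integral_0^inf Re[exp(-i t x) * cf(t)] dt,
    truncated at t_max and evaluated with the composite trapezoidal rule.
    Accuracy depends on how fast |cf(t)| decays, i.e. on a1 + a2."""
    t = np.linspace(0.0, t_max, n)
    h = t[1] - t[0]
    x = np.atleast_1d(np.asarray(x, dtype=float))
    integrand = np.real(np.exp(-1j * np.outer(x, t)) * gdd_cf(t, a1, b1, a2, b2))
    # Composite trapezoidal rule: endpoints get half weight.
    integral = h * (integrand.sum(axis=1) - 0.5 * (integrand[:, 0] + integrand[:, -1]))
    return integral / np.pi


if __name__ == "__main__":
    # Unequal shape parameters, the case the summary is concerned with.
    print(gdd_pdf([-1.0, 0.5, 2.0], a1=2.5, b1=1.0, a2=1.5, b2=2.0))
```

For shapes `a1 = a2 = 1` the difference of two exponential variables has a closed-form density, which gives a simple sanity check for the sketch. A fixed truncation point converges slowly when `a1 + a2` is small, which is the kind of difficulty that oscillatory quadrature schemes are designed to address.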

MSC:

62-XX Statistics
65C60 Computational problems in statistics (MSC2010)
62E10 Characterization and structure theory of statistical distributions
62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)
91B84 Economic time series analysis
