Flexible regression models for counts with high-inflation of zeros. (English) Zbl 1436.62309

Summary: In this paper, we introduce a flexible class of regression models for counts with a high inflation of zeros that cannot be predicted by the Poisson, the zero-inflated Poisson, the negative binomial, or the Poisson-inverse Gaussian regression models. The proposed models are based on a class of zero-inflated mixed Poisson distributions, which contains the zero-inflated negative binomial (ZINB) and the zero-inflated Poisson-inverse Gaussian (ZIPIG) distributions, among others, as particular cases. We consider regression structures for the mean, the dispersion, and the zero-inflation parameters. Consequently, we generalize existing models, such as the ZINB regression (with non-varying dispersion), and also open the possibility of introducing new models, such as the ZIPIG and the zero-inflated generalized hyperbolic secant regressions. We propose an Expectation-Maximization (EM) algorithm for estimating the parameters and the associated information matrix. Simulation results are presented to compare the finite-sample performance of the proposed EM algorithm with a direct maximization of the log-likelihood function based on the GAMLSS approach. These simulations show some advantages of the EM algorithm over the GAMLSS approach. We also discuss a measure of influence based on the EM approach and propose simulated envelopes for checking the adequacy of our zero-inflated regression models. An empirical application to the number of roots produced by 270 micropropagated shoots of the columnar apple cultivar Trajan illustrates the usefulness of the proposed class of regression models for count data with a high inflation of zeros, and shows that the GAMLSS approach cannot be used in some practical situations because of numerical problems.
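As an illustration of the ZINB special case mentioned in the summary, the following is a minimal sketch and not the authors' implementation. It assumes a negative binomial parameterized by mean mu and dispersion phi (variance mu + mu^2/phi), a zero-inflation probability pi, and log/log/logit links for the three regression structures; the function names and the link choices are hypothetical. The last function computes the posterior probability that an observed zero is a structural zero, the kind of quantity an E-step for a zero-inflated model typically needs.

```python
import math


def inverse_links(eta_mu, eta_phi, eta_pi):
    """Map linear predictors to (mu, phi, pi).

    Assumed links (not taken from the paper): log for the mean,
    log for the dispersion, logit for the zero-inflation probability.
    """
    mu = math.exp(eta_mu)
    phi = math.exp(eta_phi)
    pi = 1.0 / (1.0 + math.exp(-eta_pi))
    return mu, phi, pi


def nb_pmf(y, mu, phi):
    """Negative binomial pmf with mean mu and variance mu + mu**2 / phi."""
    log_p = (
        math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
        + y * math.log(mu / (mu + phi))
        + phi * math.log(phi / (mu + phi))
    )
    return math.exp(log_p)


def zinb_pmf(y, mu, phi, pi):
    """Zero-inflated NB pmf: extra point mass pi at zero, NB otherwise."""
    base = (1.0 - pi) * nb_pmf(y, mu, phi)
    return pi + base if y == 0 else base


def structural_zero_weight(y, mu, phi, pi):
    """Posterior probability that observation y is a structural zero."""
    if y != 0:
        return 0.0  # a positive count cannot come from the zero component
    return pi / zinb_pmf(0, mu, phi, pi)


if __name__ == "__main__":
    mu, phi, pi = inverse_links(math.log(2.0), math.log(1.5), -0.85)
    print(zinb_pmf(0, mu, phi, pi), structural_zero_weight(0, mu, phi, pi))
```

In a full EM scheme these weights would be recomputed at each iteration from the current parameter estimates; the M-step, the treatment of the mixed Poisson latent variable, and the information matrix described in the summary are not sketched here.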

MSC:

62J02 General nonlinear regression
60G55 Point processes (e.g., Poisson, Cox, Hawkes processes)
62P10 Applications of statistics to biology and medical sciences; meta analysis
62G08 Nonparametric regression and quantile regression

Software:

R; GAMLSS
Full Text: DOI

References:

[1] Atkinson, AC, Plots, Transformations and Regression (1985), Oxford: Oxford University Press, Oxford · Zbl 0582.62065
[2] Barreto-Souza, W.; Simas, AB, General mixed Poisson regression models with varying dispersion, Stat. Comput., 26, 1263-1280 (2016) · Zbl 1505.62050 · doi:10.1007/s11222-015-9601-6
[3] Böhning, D.; Dietz, E.; Schlattmann, P.; Mendonça, L.; Kirchner, U., The zero-inflated Poisson model and the decayed, missing and filled teeth index in dental epidemiology, J. R. Stat. Soc. Ser. A, 162, 195-209 (1999) · doi:10.1111/1467-985X.00130
[4] Cameron, AC; Trivedi, PK, Regression Analysis of Count Data (1998), Cambridge: Cambridge University Press, Cambridge · Zbl 0924.62004
[5] Cook, RD, Detection of influential observations in linear regression, Technometrics, 19, 15-18 (1977) · Zbl 0371.62096
[6] Dean, CB; Lawless, J.; Willmot, GE, A mixed Poisson-inverse Gaussian regression model, Can. J. Stat., 17, 171-182 (1989) · Zbl 0679.62051 · doi:10.2307/3314846
[7] Dean, CB; Nielsen, JD, Generalized linear mixed models: a review and some extensions, Lifetime Data Anal., 13, 497-512 (2007) · Zbl 1331.62361 · doi:10.1007/s10985-007-9065-x
[8] Dempster, AP; Laird, NM; Rubin, DB, Maximum likelihood from incomplete data via the EM algorithm (with discussion), J. R. Stat. Soc. Ser. B, 39, 1-38 (1977) · Zbl 0364.62022
[9] Famoye, F.; Singh, KP, Zero-inflated generalized Poisson regression model with an application to domestic violence data, J. Data Sci., 4, 117-130 (2006)
[10] Garay, AM; Hashimoto, EM; Ortega, EM; Lachos, VH, On estimation and influence diagnostics for zero-inflated negative binomial regression models, Comput. Stat. Data Anal., 55, 1304-1318 (2011) · Zbl 1328.65029 · doi:10.1016/j.csda.2010.09.019
[11] Hall, D., Zero-inflated Poisson and binomial regression with random effects: a case study, Biometrics, 56, 1030-1039 (2000) · Zbl 1060.62535 · doi:10.1111/j.0006-341X.2000.01030.x
[12] Hilbe, JM, Negative Binomial Regression (2008), New York: Cambridge University Press, New York
[13] Hinde, J.; Demétrio, CGB, Overdispersion: models and estimation, Comput. Stat. Data Anal., 27, 151-170 (1998) · Zbl 1042.62578 · doi:10.1016/S0167-9473(98)00007-3
[14] Holla, MS, On a Poisson-inverse Gaussian distribution, Metrika, 11, 115-121 (1966) · Zbl 0156.40402 · doi:10.1007/BF02613581
[15] Karlis, D.; Xekalaki, E., Mixed Poisson distributions, Int. Stat. Rev., 73, 35-58 (2005) · Zbl 1104.62010 · doi:10.1111/j.1751-5823.2005.tb00250.x
[16] Lambert, D., Zero-inflated Poisson regression, with an application to defects in manufacturing, Technometrics, 34, 1-14 (1992) · Zbl 0850.62756 · doi:10.2307/1269547
[17] Lawless, JF, Negative binomial and mixed Poisson regression, Can. J. Stat., 15, 209-225 (1987) · Zbl 0632.62060 · doi:10.2307/3314912
[18] Lee, AH; Wang, K.; Yau, KK, Analysis of zero-inflated Poisson data incorporating extent of exposure, Biometr. J., 43, 963-975 (2001) · Zbl 0989.62063 · doi:10.1002/1521-4036(200112)43:8<963::AID-BIMJ963>3.0.CO;2-K
[19] Li, CS; Lu, JC; Park, J.; Kim, K.; Brinkley, PA; Peterson, JP, Multivariate zero-inflated Poisson models and their applications, Technometrics, 41, 29-38 (1999) · doi:10.1080/00401706.1999.10485593
[20] Lim, HK; Li, WK; Yu, PLH, Zero-inflated Poisson regression mixture model, Comput. Stat. Data Anal., 71, 151-158 (2014) · Zbl 1471.62116 · doi:10.1016/j.csda.2013.06.021
[21] Louis, TA, Finding the observed information matrix when using the EM algorithm, J. R. Stat. Soc. Ser. B, 44, 226-233 (1982) · Zbl 0488.62018
[22] Mwalili, SM; Lesaffre, E.; Declerck, D., The zero-inflated negative binomial regression model with correction for misclassification: an example in caries research, Stat. Methods Med. Res., 17, 123-139 (2008) · Zbl 1157.62042 · doi:10.1177/0962280206071840
[23] Oliveira, M.; Einbeck, J.; Higueras, M.; Ainsbury, E.; Puig, P.; Rothkamm, K., Zero-inflated regression models for radiation-induced chromosome aberration data: a comparative study, Biometr. J., 58, 259-279 (2016) · Zbl 1381.62280 · doi:10.1002/bimj.201400233
[24] R Core Team, R: A Language and Environment for Statistical Computing (2016), Vienna: R Foundation for Statistical Computing, Vienna
[25] Ridout, MS; Demétrio, CGB; Hinde, JP, Models for count data with many zeros, Proceedings of the XIXth International Biometrics Conference, Cape Town, Invited Papers, 179-192 (1998)
[26] Ridout, M.; Hinde, J.; Demétrio, CGB, A score test for testing a zero-inflated Poisson regression model against zero-inflated negative binomial alternatives, Biometrics, 57, 219-223 (2001) · Zbl 1209.62079 · doi:10.1111/j.0006-341X.2001.00219.x
[27] Rigby, RA; Stasinopoulos, DM, Generalized additive models for location, scale and shape (with discussion), Appl. Stat., 54, 507-554 (2005) · Zbl 1490.62201
[28] Shankar, V.; Milton, J.; Mannering, F., Modelling accident frequencies as zero-altered probability processes: An empirical inquiry, Accid. Anal. Prev., 29, 829-837 (1997) · doi:10.1016/S0001-4575(97)00052-3
[29] Sichel, HS, On a family of discrete distributions particularly suited to represent long-tailed frequency data, Proceedings of the Third Symposium on Mathematical Statistics, Pretoria, CSIR, 51-97 (1971) · Zbl 0274.60012
[30] Willmot, GE, The Poisson-inverse Gaussian distribution as an alternative to the negative binomial, Scand. Actuar. J., 20, 113-127 (1989)
[31] Wu, CFJ, On the convergence properties of the EM algorithm, Ann. Stat., 11, 95-103 (1983) · Zbl 0517.62035 · doi:10.1214/aos/1176346060
[32] Yau, KKW; Wang, K.; Lee, AH, Zero-inflated negative binomial mixed regression modelling of over-dispersed count data with extra zeros, Biometr. J., 45, 437-452 (2003) · Zbl 1441.62543 · doi:10.1002/bimj.200390024
[33] Zhu, HT; Lee, SY; Wei, BC; Zhu, J., Case-deletion measures for models with incomplete data, Biometrika, 88, 727-737 (2001) · Zbl 1006.62021 · doi:10.1093/biomet/88.3.727
[34] Zhu, HT; Lee, SY, Local influence for incomplete-data models, J. R. Stat. Soc. Ser. B, 63, 111-126 (2001) · Zbl 0976.62071 · doi:10.1111/1467-9868.00279
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.