
Generalized linear mixed models: a review and some extensions. (English) Zbl 1331.62361

Summary: The paper of N. E. Breslow and D. G. Clayton [J. Am. Stat. Assoc. 88, No. 421, 9–25 (1993; Zbl 0775.62195)] was, and still is, highly influential in promoting the use of generalized linear mixed models in epidemiology and a wide variety of other fields. An important factor in this broad usage is ease of implementation: related software is readily available in, for example, SAS (SAS Institute, PROC GLIMMIX, SAS Institute Inc., http://www.sas.com, 2007), S-PLUS (Insightful Corporation, S-PLUS 8, Insightful Corporation, Seattle, WA, http://www.insightful.com, 2007), and R (R Development Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, http://www.R-project.org, 2006). This paper reviews the background to generalized linear mixed models and the inferential techniques that have been developed for them. To give the reader a flavor of the utility and wide applicability of this fundamental methodology, we consider a few extensions, including additive models, models for zero-heavy data, and models accommodating latent clusters.

MSC:

62J12 Generalized linear models (logistic models)
62F10 Point estimation
62J99 Linear inference, regression
62P10 Applications of statistics to biology and medical sciences; meta analysis
62-02 Research exposition (monographs, survey articles) pertaining to statistics

Citations:

Zbl 0775.62195

References:

[1] Abramowitz M, Stegun IA (eds) (1984) Handbook of mathematical functions with formulas, graphs, and mathematical tables. A Wiley-Interscience Publication, John Wiley, New York · Zbl 0171.38503
[2] Ainsworth L, Dean CB (2007) Detection of local and global outliers in mapping studies. Environmetrics. doi: 10.1002/env.851
[3] Barndorff-Nielsen O, Cox DR (1979) Edgeworth and saddle-point approximations with statistical applications (with discussion). J R Stat Soc Ser B: Methodol 41: 279–299 · Zbl 0424.62010
[4] Bates D, Sarkar D (2007) lme4: Linear mixed-effects models using S4 classes. R package version 0.99875-1
[5] Breslow N (1989) Score tests in overdispersed GLM's. In: Decarli A, Francis BJ, Gilchrist R, Seeber GUH (eds) Statistical modelling. Springer-Verlag Inc., pp 64–74
[6] Breslow NE, Clayton DG (1993) Approximate inference in generalized linear mixed models. J Am Stat Assoc 88: 9–25 · Zbl 0775.62195
[7] Breslow NE, Lin X (1995) Bias correction in generalised linear mixed models with a single component of dispersion. Biometrika 82: 81–91 · Zbl 0823.62059 · doi:10.1093/biomet/82.1.81
[8] Casella G, George EI (1992) Explaining the Gibbs sampler. Am Stat 46: 167–174
[9] Chen J, Zhang D, Davidian M (2002) A Monte Carlo EM algorithm for generalized linear mixed models with flexible random effects distribution. Biostatistics (Oxford) 3: 347–360 · Zbl 1135.62355 · doi:10.1093/biostatistics/3.3.347
[10] Cheng KF, Wu JW (1994) Testing goodness of fit for a parametric family of link functions. J Am Stat Assoc 89: 657–664 · Zbl 0818.62019 · doi:10.1080/01621459.1994.10476790
[11] Cook RD, Weisberg S (1989) Regression diagnostics with dynamic graphics (C/R: P293–311). Technometrics 31: 277–291
[12] Davidian M, Carroll RJ (1987) Variance function estimation. J Am Stat Assoc 82: 1079–1091 · Zbl 0648.62076 · doi:10.1080/01621459.1987.10478543
[13] Dean CB, Balshaw R (1997) Efficiency lost by analyzing counts rather than event times in Poisson and overdispersed Poisson regression models. J Am Stat Assoc 92: 1387–1398 · Zbl 0913.62079 · doi:10.1080/01621459.1997.10473659
[14] Dean C, Lawless JF, Willmot GE (1989) A mixed Poisson-inverse-Gaussian regression model. Can J Stat 17: 171–181 · Zbl 0679.62051 · doi:10.2307/3314846
[15] Dubin JA, Han L, Fried TR (2007) Triggered sampling could help improve longitudinal studies of persons with elevated mortality risk. J Clin Epidemiol 60: 288–93 · doi:10.1016/j.jclinepi.2006.06.012
[16] Green PJ, Silverman BW (1994) Nonparametric regression and generalized linear models: a roughness penalty approach. Chapman & Hall Ltd.
[17] Harville DA (1977) Maximum likelihood approaches to variance component estimation and to related problems. J Am Stat Assoc 72: 320–338 · Zbl 0373.62040 · doi:10.1080/01621459.1977.10480998
[18] Hastie T, Tibshirani R (1999) Generalized additive models. Chapman & Hall Ltd. · Zbl 0747.62061
[19] Henderson R, Shimakura S (2003) A serially correlated gamma frailty model for longitudinal count data. Biometrika 90: 355–366 · Zbl 1034.62115 · doi:10.1093/biomet/90.2.355
[20] Insightful Corporation (2007) S-PLUS 8. Insightful Corporation, Seattle, WA. URL http://www.insightful.com , accessed on 25 October 2007
[21] Jiang J (1998) Consistent estimators in generalized linear mixed models. J Am Stat Assoc 93: 720–729 · Zbl 0926.62051 · doi:10.1080/01621459.1998.10473724
[22] Jørgensen B (1982) Statistical properties of the generalized inverse Gaussian distribution. Lecture Notes in Statistics, vol 9. Springer-Verlag, New York · Zbl 0486.62022
[23] Kleinman K, Lazarus R, Platt R (2004) A generalized linear mixed models approach for detecting incident clusters of disease in small areas, with an application to biological terrorism. Am J Epidemiol 159: 217–24 · doi:10.1093/aje/kwh029
[24] Laird NM (1991) Topics in likelihood-based methods for longitudinal data analysis. Statistica Sinica 1: 33–50 · Zbl 0829.62068
[25] Laird NM, Louis TA (1982) Approximate posterior distributions for incomplete data problems. J R Stat Soc Ser B: Methodol 44: 190–200 · Zbl 0502.62031
[26] Lambert D (1992) Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 34: 1–14 · Zbl 0850.62756 · doi:10.2307/1269547
[27] Lawless JF (1987) Negative binomial and mixed Poisson regression. Can J Stat 15: 209–225 · Zbl 0632.62060 · doi:10.2307/3314912
[28] Lawless JF, Zhan M (1998) Analysis of interval-grouped recurrent-event data using piecewise constant rate functions. Can J Stat 26: 549–565 · Zbl 0963.62063 · doi:10.2307/3315717
[29] Liang K-Y, Zeger SL (1986) Longitudinal data analysis using generalized linear models. Biometrika 73: 13–22 · Zbl 0595.62110 · doi:10.1093/biomet/73.1.13
[30] Lin X, Breslow NE (1996) Bias correction in generalized linear mixed models with multiple components of dispersion. J Am Stat Assoc 91: 1007–1016 · Zbl 0882.62059 · doi:10.1080/01621459.1996.10476971
[31] Lin X, Zhang D (1999) Inference in generalized additive mixed models by using smoothing splines. J R Stat Soc Ser B: Stat Methodol 61: 381–400 · Zbl 0915.62062 · doi:10.1111/1467-9868.00183
[32] Lin X, Raz J, Harlow SD (1997) Linear mixed models with heterogeneous within-cluster variances. Biometrics 53: 910–923 · Zbl 0891.62048 · doi:10.2307/2533552
[33] Lindstrom MJ, Bates DM (1988) Newton-Raphson and EM algorithms for linear mixed-effects models for repeated-measures data. J Am Stat Assoc 83: 1014–1022, Corr: 94V89, p 1572 · Zbl 0671.65119
[34] Martin TG, Wintle BA, Rhodes JR, Kuhnert PM, Field SA, Low-Choy SJ, Tyre AJ, Possingham HP (2005) Zero tolerance ecology: improving ecological inference by modelling the source of zero observations. Ecol Lett 8: 1235–1246 · doi:10.1111/j.1461-0248.2005.00826.x
[35] McCullagh P, Nelder JA (1989) Generalized linear models. Chapman & Hall Ltd. · Zbl 0744.62098
[36] McCulloch CE (1997) Maximum likelihood algorithms for generalized linear mixed models. J Am Stat Assoc 92: 162–170 · Zbl 0889.62061 · doi:10.1080/01621459.1997.10473613
[37] Nelder JA, Pregibon D (1987) An extended quasi-likelihood function. Biometrika 74: 221–232 · Zbl 0621.62078 · doi:10.1093/biomet/74.2.221
[38] Nielsen JD, Dean CB (2007) Clustered mixed nonhomogeneous Poisson process spline models for the analysis of recurrent event panel data. Biometrics. doi: 10.1111/j.1541-0420.2007.00940.x
[39] Nodtvedt A, Dohoo I, Sanchez J, Conboy G, DesCôteaux L, Keefe G, Leslie K, Campbell J (2002) The use of negative binomial modelling in a longitudinal study of gastrointestinal parasite burdens in Canadian dairy cows. Can J Vet Res 66: 249–257
[40] Pierce DA, Schafer DW (1986) Residuals in generalized linear models. J Am Stat Assoc 81: 977–986 · Zbl 0644.62076 · doi:10.1080/01621459.1986.10478361
[41] Pinheiro JC, Bates DM (2000) Mixed-effects models in S and S-PLUS. Springer-Verlag Inc. · Zbl 0953.62065
[42] R Development Core Team (2006) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org , accessed on 25 October 2007
[43] Raudenbush SW, Yang M-L, Yosef M (2000) Maximum likelihood for generalized linear models with nested random effects via high-order, multivariate Laplace approximation. J Comput Graph Stat 9: 141–157
[44] Rich-Edwards JW, Kleinman KP, Strong EF, Oken E, Gillman MW (2005) Preterm delivery in Boston before and after September 11th, 2001. Epidemiology 16: 323–327 · doi:10.1097/01.ede.0000158801.04494.52
[45] Rosen O, Jiang W, Tanner MA (2000) Mixtures of marginal models. Biometrika 87: 391–404 · Zbl 0949.62067 · doi:10.1093/biomet/87.2.391
[46] Sartori N, Severini TA (2004) Conditional likelihood inference in generalized linear mixed models. Statistica Sinica 14: 349–360 · Zbl 1045.62072
[47] SAS Institute (2007) PROC GLIMMIX. SAS Institute Inc. URL http://www.sas.com , accessed on 25 October 2007
[48] Shun Z, McCullagh P (1995) Laplace approximation of high dimensional integrals. J R Stat Soc Ser B: Methodol 57: 749–760 · Zbl 0826.41026
[49] Sichel HS (1974) On a distribution representing sentence-length in written prose. J R Stat Soc Ser A 137: 25–34 · doi:10.2307/2345142
[50] Simons JS, Neal DJ, Gaher RM (2006) Risk for marijuana-related problems among college students: an application of zero-inflated negative binomial regression. Am J Drug Alcohol Abuse 32: 41–53 · doi:10.1080/00952990500328539
[51] Song PX-K, Fan Y, Kalbfleisch JD (2005) Maximization by parts in likelihood inference. J Am Stat Assoc 100: 1145–1158 · Zbl 1117.62429 · doi:10.1198/016214505000000204
[52] Stangle DE, Smith DR, Beaudin SA, Strawderman MS, Levitsky DA, Strupp BJ (2007) Succimer chelation improves learning, attention, and arousal regulation in lead-exposed rats but produces lasting cognitive impairment in the absence of lead exposure. Environ Health Perspect 115: 201–209 · doi:10.1289/ehp.9263
[53] Tchetgen EJ, Coull BA (2006) A diagnostic test for the mixing distribution in a generalised linear mixed model. Biometrika 93: 1003–1010 · Zbl 1436.62364 · doi:10.1093/biomet/93.4.1003
[54] Tierney L, Kass RE, Kadane JB (1989) Approximate marginal densities of nonlinear functions. Biometrika 76: 425–433, Corr: V78, p233–234 · Zbl 0676.62016 · doi:10.1093/biomet/76.3.425
[55] Tjur T (1982) A connection between Rasch’s item analysis model and a multiplicative Poisson model. Scand J Stat 9: 23–30 · Zbl 0484.62115
[56] Venables WN, Ripley BD (2002) Modern applied statistics with S, 4th edn. Springer, New York. URL http://www.stats.ox.ac.uk/pub/MASS4 , accessed on 25 October 2007
[57] Vonesh EF, Wang H, Nie L, Majumdar D (2002) Conditional second-order generalized estimating equations for generalized linear and nonlinear mixed-effects models. J Am Stat Assoc 97: 271–283 · Zbl 1073.62591 · doi:10.1198/016214502753479400
[58] Waagepetersen R (2006) A simulation-based goodness-of-fit test for random effects in generalized linear mixed models. Scand J Stat 33: 721–731 · Zbl 1164.62348 · doi:10.1111/j.1467-9469.2006.00504.x
[59] Wedderburn RWM (1974) Quasi-likelihood functions, generalized linear models, and the Gauss–Newton method. Biometrika 61: 439–447 · Zbl 0292.62050
[60] White H (1982) Maximum likelihood estimation of misspecified models. Econometrica 50: 1–26 · Zbl 0478.62088 · doi:10.2307/1912526
[61] Wood S (2006) mgcv: GAMs with GCV smoothness estimation and GAMMs by REML/PQL. R package version 1.3-24
[62] Zeger SL, Karim MR (1991) Generalized linear models with random effects: a Gibbs sampling approach. J Am Stat Assoc 86: 79–86 · doi:10.1080/01621459.1991.10475006