Abstract
This paper presents an EM algorithm for maximum likelihood estimation in generalized linear models with overdispersion. The algorithm is initially derived as a form of Gaussian quadrature assuming a normal mixing distribution, but with only slight variation it can be used for a completely unknown mixing distribution, giving a straightforward method for the fully non-parametric ML estimation of this distribution. This is of value because the ML estimates of the GLM parameters may be sensitive to the specification of a parametric form for the mixing distribution. A listing of a GLIM4 algorithm for fitting the overdispersed binomial logit model is given in an appendix.
A simple method is given for obtaining correct standard errors for parameter estimates when using the EM algorithm.
Several examples are discussed.
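As a rough illustration of the quadrature form of the algorithm described above, the following Python sketch fits an overdispersed binomial logit model by EM. It is not the GLIM4 listing from the appendix. The function name `fit_overdispersed_logit`, the use of NumPy's `hermegauss` for the mass points, the starting values, and the fixed number of Newton steps per M-step are all assumptions made for this example.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def fit_overdispersed_logit(X, y, n, K=5, max_iter=500, tol=1e-8):
    """EM fit of logit(p_i) = x_i'beta + sigma * z with z ~ N(0, 1).

    The normal mixing distribution is replaced by K Gauss-Hermite mass
    points, so the likelihood becomes a finite mixture of binomials.
    Returns (beta, sigma, loglik_history); the log-likelihoods omit the
    binomial coefficients, which do not depend on the parameters.
    """
    X, y, n = np.asarray(X, float), np.asarray(y, float), np.asarray(n, float)
    N, p = X.shape
    z, w = hermegauss(K)          # nodes/weights for the weight exp(-z^2 / 2)
    log_pi = np.log(w / w.sum())  # fixed prior masses, normalised to sum to 1

    # Expanded data: every observation appears once per mass point, with the
    # mass-point location as an extra covariate whose coefficient is sigma.
    Xe = np.column_stack([np.repeat(X, K, axis=0), np.tile(z, N)])
    ye, ne = np.repeat(y, K), np.repeat(n, K)

    def e_step(theta):
        # Posterior probability that observation i sits at mass point k,
        # computed on the log scale for numerical stability.
        eta = (Xe @ theta).reshape(N, K)
        a = log_pi + y[:, None] * eta - n[:, None] * np.logaddexp(0.0, eta)
        m = a.max(axis=1, keepdims=True)
        log_mix = m[:, 0] + np.log(np.exp(a - m).sum(axis=1))
        return np.exp(a - log_mix[:, None]), log_mix.sum()

    theta = np.zeros(p + 1)
    theta[-1] = 0.5  # sigma = 0 is a stationary point, so start away from it
    W, loglik = e_step(theta)
    history = [loglik]
    for _ in range(max_iter):
        # M-step: weighted binomial logit fit on the expanded data,
        # approximated here by a few Newton (IRLS) steps.
        we = W.ravel()
        for _step in range(5):
            mu = np.exp(-np.logaddexp(0.0, -(Xe @ theta)))
            grad = Xe.T @ (we * (ye - ne * mu))
            hess = (Xe * (we * ne * mu * (1.0 - mu))[:, None]).T @ Xe
            theta = theta + np.linalg.solve(hess, grad)
        W, loglik = e_step(theta)
        history.append(loglik)
        if abs(history[-1] - history[-2]) < tol:
            break
    return theta[:-1], abs(theta[-1]), history
```

Calling `fit_overdispersed_logit(X, y, n)` with a design matrix `X`, success counts `y` and binomial denominators `n` returns the regression coefficients, the estimated standard deviation of the mixing distribution, and the log-likelihood trace. In this sketch the mass points and masses are held fixed at their quadrature values. A non-parametric variant along the lines the abstract describes would presumably estimate them as well, for instance by updating each mass as the mean of the corresponding column of posterior weights, which is the usual EM update for mixture proportions.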
Aitkin, M. A general maximum likelihood analysis of overdispersion in generalized linear models. Stat Comput 6, 251–262 (1996). https://doi.org/10.1007/BF00140869