Summary
Smoothing methods that use basis functions with penalisation can be formulated as maximum likelihood estimators and best predictors in a mixed model framework. Such connections are at least a quarter of a century old but, perhaps with the advent of mixed model software, have led to a paradigm shift in the field of smoothing. The reason is that most, perhaps all, models involving smoothing can be expressed as a mixed model and hence enjoy the benefit of the growing body of methodology and software for general mixed model analysis. The handling of other complications such as clustering, missing data and measurement error is generally quite straightforward with mixed model representations of smoothing.
Acknowledgements
The ideas summarised in this article are the result of interaction with several of my colleagues at Harvard School of Public Health in the period 1997–2002: Babette Brumback, Tianxi Cai, Brent Coull, Jonathan French, Bhaswati Ganguli, Erin Kammann, Long Ngo, Nan Laird, Helen Parise, Louise Ryan, Misha Salganik, Joel Schwartz, John Staudenmayer, Sally Thurston, Jim Ware and Yihua Zhao. The paper has also benefited greatly from conversations with Marc Aerts, Ray Carroll, Gerda Claeskens, Ciprian Crainiceanu, Maria Durban, Jim Hobert, Robert Kohn, Xihong Lin, Mary Lindstrom, Michael O’Connell, José Pinheiro and David Ruppert. I am grateful to Professors Trevor Hastie and Gareth James for making the spinal bone mineral density data available. Finally, thank you to the participants in the Euroworkshop on Nonparametric Models (HPCFCT-2000-00041) held in Bernried, Germany in November 2001, and to its co-organiser, Göran Kauermann, for encouraging me to write this paper. This paper was supported by U.S. National Institute of Environmental Health Sciences grant R01-ES10844-01.
Appendix: Demmler-Reinsch orthogonalisation
Let X and Z contain the fixed and random effect basis functions for a scatterplot smooth (e.g. as in Section 3.2 or Section 3.6.1). As shown in Section 3.2.1, penalised spline regression corresponds to the ridge regression
$$\widehat{\bf f}_\alpha = C(C^TC + \alpha D)^{-1}C^T{\bf y} \qquad (9.1)$$
for some diagonal matrix D and with C = [X Z]. Here α controls the amount of smoothing and in the mixed model formulation of penalised splines \(\alpha = \sigma _\varepsilon ^2/\sigma _u^2\). Algorithm 1 allows for fast and stable calculation of (9.1).
Algorithm 1
1. Obtain the Cholesky decomposition \(C^TC = R^TR\), where R is square and invertible.
2. Obtain the singular value decomposition \(R^{-T}DR^{-1} = U\,{\rm diag}({\bf s})\,U^T\).
3. Compute \(A \equiv CR^{-1}U\) and \({\bf b} \equiv A^T{\bf y}\).
4. Compute \(\widehat{\bf f}_\alpha = A\{{\bf b}/({\bf 1} + \alpha {\bf s})\}\), with the division taken elementwise.
The Cholesky decomposition applies only to nonsingular matrices. If C is ill-conditioned, it is advisable to add a small multiple of D to \(C^TC\) before applying the Cholesky decomposition, so that the Cholesky factor R satisfies
$$R^TR = C^TC + \delta D$$
where δ is small, e.g., \(\delta = 10^{-10}\).
Once the matrix A and vectors b and s have been computed, calculating the vector of fitted values for different values of α reduces to a matrix multiplication. Therefore, \({\widehat {\bf{f}}_\alpha }\) can be computed cheaply for several α values. This is particularly useful for automatic smoothing parameter selection.
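The computation described above can be sketched in NumPy. This is a minimal sketch rather than the author's implementation: it assumes that the algorithm factorises \(C^TC\) (optionally plus δD) by Cholesky, decomposes \(R^{-T}DR^{-1}\) as \(U\,{\rm diag}({\bf s})\,U^T\), and then forms \(A = CR^{-1}U\) and \({\bf b} = A^T{\bf y}\). The function names `demmler_reinsch` and `fitted_values`, the linear truncated-line basis, the knot placement and the simulated data are all illustrative assumptions.

```python
import numpy as np

def demmler_reinsch(C, D, y, delta=0.0):
    """Return (A, b, s) so that the fit for any alpha is A @ (b / (1 + alpha * s))."""
    # NumPy returns the lower-triangular factor L with L L^T = C^T C + delta * D,
    # so the upper-triangular R with R^T R = C^T C + delta * D is L^T.
    R = np.linalg.cholesky(C.T @ C + delta * D).T
    R_inv = np.linalg.solve(R, np.eye(R.shape[0]))
    # R^{-T} D R^{-1} is symmetric positive semidefinite, so its
    # eigendecomposition U diag(s) U^T plays the role of the SVD.
    M = R_inv.T @ D @ R_inv
    s, U = np.linalg.eigh((M + M.T) / 2.0)  # symmetrise against rounding error
    A = C @ R_inv @ U
    b = A.T @ y
    return A, b, s

def fitted_values(A, b, s, alphas):
    """Fitted vectors for several alpha values, one column per alpha."""
    alphas = np.atleast_1d(alphas)
    shrink = 1.0 / (1.0 + np.outer(s, alphas))  # shape (p, number of alphas)
    return A @ (b[:, None] * shrink)

# Simulated scatterplot data (assumption, for illustration only).
rng = np.random.default_rng(0)
n, K = 200, 20
x = np.sort(rng.uniform(0.0, 1.0, n))
y = np.sin(2.0 * np.pi * x) + 0.3 * rng.standard_normal(n)

# C = [X Z] with a linear truncated-line basis; D penalises only the Z columns.
knots = np.linspace(0.0, 1.0, K + 2)[1:-1]
X = np.column_stack([np.ones(n), x])
Z = np.maximum(x[:, None] - knots[None, :], 0.0)
C = np.hstack([X, Z])
D = np.diag(np.r_[0.0, 0.0, np.ones(K)])

A, b, s = demmler_reinsch(C, D, y)
alphas = np.array([0.01, 1.0, 100.0])
F = fitted_values(A, b, s, alphas)  # one fitted curve per column
```

The decomposition is done once; each additional α costs only an elementwise rescaling of b and one multiplication by A. With `delta > 0` the identity holds only approximately, with an error of order δ.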
Justification of Algorithm 1
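Write \(C^TC = R^TR\) for the Cholesky decomposition and \(R^{-T}DR^{-1} = U\,{\rm diag}({\bf s})\,U^T\) for the singular value decomposition used in Algorithm 1. Now
$$\widehat{\bf f}_\alpha = C(C^TC + \alpha D)^{-1}C^T{\bf y} = C\{R^T(I + \alpha R^{-T}DR^{-1})R\}^{-1}C^T{\bf y} = CR^{-1}(I + \alpha R^{-T}DR^{-1})^{-1}R^{-T}C^T{\bf y}.$$
Since U is a square matrix with orthonormal columns, \(U^T = U^{-1}\) and so
$$I + \alpha R^{-T}DR^{-1} = UU^T + \alpha U\,{\rm diag}({\bf s})\,U^T = U\,{\rm diag}({\bf 1} + \alpha {\bf s})\,U^T.$$
Also,
$$\{{\rm diag}({\bf 1} + \alpha {\bf s})\}^{-1} = {\rm diag}\{{\bf 1}/({\bf 1} + \alpha {\bf s})\},$$
with the division taken elementwise, and consequently
$$(I + \alpha R^{-T}DR^{-1})^{-1} = U\,{\rm diag}\{{\bf 1}/({\bf 1} + \alpha {\bf s})\}\,U^T.$$
Hence
$$\widehat{\bf f}_\alpha = CR^{-1}U\,{\rm diag}\{{\bf 1}/({\bf 1} + \alpha {\bf s})\}\,U^TR^{-T}C^T{\bf y} = A\,{\rm diag}\{{\bf 1}/({\bf 1} + \alpha {\bf s})\}\,{\bf b}$$
where \(A \equiv CR^{-1}U\) and \({\bf b} \equiv A^T{\bf y}\).
The new expression for \({\widehat {\bf{f}}_\alpha }\) is thus of the form
$$\widehat{\bf f}_\alpha = A\{A^TA + \alpha\,{\rm diag}({\bf s})\}^{-1}A^T{\bf y}$$
because \(A^TA = U^TR^{-T}(R^TR)R^{-1}U = U^TU = I\).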
Comparison with (9.1) shows that we have effectively replaced the basis functions in C with those in A, where this design matrix has the orthogonality property \(A^TA = I\). The columns of A correspond to the Demmler-Reinsch basis for the vector space spanned by C. The orthogonality property is crucial for the fast computation over several smoothing parameters.
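To illustrate why orthogonality matters for smoothing parameter selection, the following sketch scans a grid of α values using generalised cross-validation. GCV is an assumed choice of criterion here, since the text only mentions automatic selection in general; the simulated data, the truncated-line basis and the names `gcv` and `alpha_hat` are likewise hypothetical. Because \(A^TA = I\), the residual sum of squares and the degrees of freedom \(\sum_j 1/(1 + \alpha s_j)\) depend only on the short vectors b and s.

```python
import numpy as np

# Simulated data and basis (assumptions, for illustration only).
rng = np.random.default_rng(1)
n, K = 150, 15
x = np.sort(rng.uniform(0.0, 1.0, n))
y = np.cos(3.0 * x) + 0.2 * rng.standard_normal(n)
knots = np.linspace(0.0, 1.0, K + 2)[1:-1]
C = np.column_stack([np.ones(n), x, np.maximum(x[:, None] - knots, 0.0)])
D = np.diag(np.r_[0.0, 0.0, np.ones(K)])

# One-off decomposition: C^T C = R^T R and R^{-T} D R^{-1} = U diag(s) U^T.
R = np.linalg.cholesky(C.T @ C).T
R_inv = np.linalg.inv(R)
M = R_inv.T @ D @ R_inv
s, U = np.linalg.eigh((M + M.T) / 2.0)
A = C @ R_inv @ U
b = A.T @ y

def gcv(alpha):
    """GCV score computed from b and s only, using A^T A = I."""
    w = b / (1.0 + alpha * s)                # coefficients in the orthogonal basis
    rss = y @ y - 2.0 * (b @ w) + w @ w      # equals ||y - A w||^2 when A^T A = I
    df = np.sum(1.0 / (1.0 + alpha * s))     # trace of the smoother matrix
    return rss / (1.0 - df / n) ** 2

alphas = 10.0 ** np.linspace(-6.0, 4.0, 41)
scores = np.array([gcv(a) for a in alphas])
alpha_hat = alphas[np.argmin(scores)]
f_hat = A @ (b / (1.0 + alpha_hat * s))      # fit at the selected alpha
```

Each grid point costs a handful of length-p vector operations, so a fine grid is inexpensive. The n-dimensional fit is formed only once, at the selected α.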
Cite this article
Wand, M.P. Smoothing and mixed models. Computational Statistics 18, 223–249 (2003). https://doi.org/10.1007/s001800300142