Comparing penalized splines and fractional polynomials for flexible modelling of the effects of continuous predictor variables. (English) Zbl 1328.65042

Summary: P(enalized)-splines and fractional polynomials (FPs) have emerged as powerful smoothing techniques with increasing popularity in applied research. Both approaches provide considerable flexibility, but only limited comparative evaluations of the performance and properties of the two methods have been conducted to date. Extensive simulations are performed to compare FPs of degree 2 (FP2) and degree 4 (FP4) with two variants of P-splines that use generalized cross-validation (GCV) and restricted maximum likelihood (REML), respectively, for smoothing parameter selection. The simulations evaluate the ability of P-splines and FPs to recover the “true” functional form of the association between exposure and continuous, binary and survival outcomes for linear, quadratic and more complex non-linear functions, using different sample sizes and signal-to-noise ratios. For more curved functions, FP2, the current default setting in implementations for fitting FPs in R, Stata and SAS, showed considerable bias and consistently higher mean squared error (MSE) than the spline-based estimators and FP4, which performed equally well in most simulation settings. FPs, however, are prone to artefacts due to the specific choice of the origin, while P-splines based on GCV sometimes yield wiggly estimates, in particular for small sample sizes. Application to a real dataset illustrates the different features of the two approaches.
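
Illustrative note (not part of the reviewed paper): the FP2 class mentioned in the summary can be sketched in a few lines. The sketch below assumes a Gaussian outcome and a strictly positive covariate, uses the conventional FP power set {-2, -1, -0.5, 0, 0.5, 1, 2, 3} with power 0 read as log(x) and a repeated power p giving the pair x^p, x^p log(x), and simply picks the power pair with the smallest residual sum of squares. The function names (`fp_term`, `fp2_row`, `solve`, `least_squares`, `best_fp2`) are hypothetical, and the closed-test function selection procedure used in practice is deliberately omitted.

```python
import math
from itertools import combinations_with_replacement

# Conventional FP power set; power 0 denotes log(x).
POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)

def fp_term(x, p):
    """Single FP transform of a positive value x; p == 0 means log(x)."""
    return math.log(x) if p == 0 else x ** p

def fp2_row(x, p1, p2):
    """Design row [1, x^p1, x^p2]; a repeated power uses x^p1 * log(x)."""
    t1 = fp_term(x, p1)
    t2 = t1 * math.log(x) if p1 == p2 else fp_term(x, p2)
    return [1.0, t1, t2]

def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    beta = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][k] * beta[k] for k in range(r + 1, n))
        beta[r] = (m[r][n] - s) / m[r][r]
    return beta

def least_squares(rows, y):
    """Ordinary least squares via the normal equations; returns (beta, rss)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return beta, rss

def best_fp2(x, y):
    """Fit all 36 FP2 models and return (rss, (p1, p2), beta) of the best one."""
    best = None
    for p1, p2 in combinations_with_replacement(POWERS, 2):
        rows = [fp2_row(xi, p1, p2) for xi in x]
        beta, rss = least_squares(rows, y)
        if best is None or rss < best[0]:
            best = (rss, (p1, p2), beta)
    return best

if __name__ == "__main__":
    # Noise-free toy data generated from an FP2 curve with powers (0, 2).
    x = [0.5 + 0.1 * i for i in range(30)]
    y = [2 + 3 * math.log(v) - 0.5 * v ** 2 for v in x]
    rss, powers, beta = best_fp2(x, y)
    print(powers, [round(b, 4) for b in beta])
```

Because every candidate is a fixed transformation of x on the original scale, shifting the covariate (changing its origin) changes the candidate curves, which is one way to see the origin-related artefacts noted in the summary.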

MSC:

65C60 Computational problems in statistics (MSC2010)
62-07 Data analysis (statistics) (MSC2010)
62P10 Applications of statistics to biology and medical sciences; meta analysis

References:

[1] Ambler, G.; Royston, P., Fractional polynomial model selection procedures: investigation of type I error rate, Journal of Statistical Computation and Simulation, 69, 89-108 (2001) · Zbl 1016.62079
[2] Belitz, C., Brezger, A., Kneib, T., Lang, S., 2009. BayesX Manuals. Technical Report. Department of Statistics, University of Munich. Available at http://www.stat.uni-muenchen.de/ bayesx
[3] Belitz, C.; Hübner, J.; Klasen, S.; Lang, S., Determinants of the socioeconomic and spatial pattern of undernutrition by sex in India: a geoadditive semi-parametric regression approach, (Kneib, T.; Tutz, G., Statistical Modelling and Regression Structures (2010), Physica-Verlag)
[4] Belitz, C.; Lang, S., Simultaneous selection of variables and smoothing parameters in structured additive regression models, Computational Statistics and Data Analysis, 53, 61-81 (2008) · Zbl 1452.62029
[5] Bender, R.; Augustin, T.; Blettner, M., Generating survival times to simulate Cox proportional hazards models, Statistics in Medicine, 24, 1713-1723 (2005)
[6] Bollaerts, K.; Eilers, P.; Van Mechelen, I., Simple and multiple \(P\)-spline regression with shape constraints, British Journal of Mathematical and Statistical Psychology, 59, 451-469 (2006)
[7] Brezger, A.; Kneib, T.; Lang, S., BayesX: analyzing Bayesian structured additive regression models, Journal of Statistical Software, 14, 1-22 (2005)
[8] Brezger, A.; Lang, S., Generalized additive regression based on Bayesian \(P\)-splines, Computational Statistics and Data Analysis, 50, 967-991 (2006) · Zbl 1431.62308
[9] Brezger, A.; Lang, S., Simultaneous probability statements for Bayesian \(P\)-splines, Statistical Modelling, 8, 141-168 (2008) · Zbl 07257866
[10] Currie, D.; Durban, M., Flexible smoothing with \(P\)-splines: a unified approach, Statistical Modelling, 2, 333-349 (2002) · Zbl 1195.62072
[11] De Boor, C., A Practical Guide to Splines (2001), Springer: Springer New York · Zbl 0987.65015
[12] Eilers, P., Marx, B., 2010. Splines, knots, and penalties. Wiley Interdisciplinary Reviews: Computational Statistics, (in press).
[13] Eilers, P. H.C.; Marx, B. D., Flexible smoothing using \(B\)-splines and penalized likelihood, Statistical Science, 11, 89-121 (1996) · Zbl 0955.62562
[14] Fahrmeir, L.; Kneib, T.; Lang, S., Penalized structured additive regression for space-time data: a Bayesian perspective, Statistica Sinica, 14, 731-761 (2004) · Zbl 1073.62025
[15] Fahrmeir, L.; Tutz, G., Multivariate Statistical Modelling Based on Generalized Linear Models (2001), Springer: Springer New York · Zbl 0980.62052
[16] Govindarajulu, U. S.; Spiegelman, D.; Thurston, S. W.; Ganguli, B.; Eisen, E. A., Comparing smoothing techniques in Cox models for exposure-response relationships, Statistics in Medicine, 26, 3735-3752 (2007)
[17] Hastie, T. J.; Tibshirani, R. J., Generalized Additive Models (1990), Chapman & Hall/CRC: Chapman & Hall/CRC London · Zbl 0747.62061
[18] Hastie, T. J.; Tibshirani, R. J.; Friedman, J., The Elements of Statistical Learning (2003), Springer: Springer New York
[19] Jullion, A.; Lambert, P., Robust specification of the roughness penalty prior distribution in spatially adaptive Bayesian \(P\)-splines models, Computational Statistics and Data Analysis, 51, 2542-2558 (2007) · Zbl 1161.62340
[20] Kauermann, G.; Krivobokova, T.; Fahrmeir, L., Some asymptotic results on generalized penalized spline smoothing, Journal of the Royal Statistical Society, 71, 487-503 (2009) · Zbl 1248.62055
[21] Lang, S.; Brezger, A., Bayesian \(P\)-splines, Journal of Computational and Graphical Statistics, 13, 183-212 (2004)
[22] Marx, B. D.; Eilers, P. H.C., Direct generalized additive modeling with penalized likelihood, Computational Statistics and Data Analysis, 28, 193-209 (1998) · Zbl 1042.62580
[23] Miller, A., Subset Selection in Regression (2002), Chapman & Hall/CRC: Chapman & Hall/CRC Boca Raton, FL · Zbl 1051.62060
[24] Royston, P.; Sauerbrei, W., Building multivariable regression models with continuous covariates in clinical epidemiology—with an emphasis on fractional polynomials, Methods of Information in Medicine, 44, 561-571 (2005)
[25] Royston, P.; Sauerbrei, W., Multivariable Model-Building: A Pragmatic Approach to Regression Analysis Based on Fractional Polynomials for Modelling Continuous Variables (2008), Wiley · Zbl 1269.62053
[26] Ruppert, D.; Carroll, R. J., Spatially adaptive penalties for spline fitting, Australian and New Zealand Journal of Statistics, 42, 205-223 (2000)
[27] Ruppert, D.; Wand, M. P.; Carroll, R. J., Semiparametric Regression (2003), Cambridge University Press: Cambridge University Press Cambridge · Zbl 1038.62042
[28] Sabanés Bové, D., Held, L., (2010). Bayesian fractional polynomials. Statistics and Computing, (in press).
[29] Sauerbrei, W.; Meier-Hirmer, C.; Benner, A.; Royston, P., Multivariable regression model building by using fractional polynomials: description of SAS, Stata and R programs, Computational Statistics and Data Analysis, 50, 3464-3485 (2006) · Zbl 1445.62008
[30] Sauerbrei, W.; Royston, P., Building multivariable prognostic and diagnostic models: transformation of the predictors by using fractional polynomials, Journal of the Royal Statistical Society A, 162, 71-94 (1999)
[31] Wand, M. P., Smoothing and mixed models, Computational Statistics, 18, 223-249 (2003) · Zbl 1050.62049
[32] Wood, S. N., Modelling and smoothing parameter estimation with multiple quadratic penalties, Journal of the Royal Statistical Society, 62, 413-428 (2000)
[33] Wood, S. N., Thin-plate regression splines, Journal of the Royal Statistical Society, 65, 95-114 (2003) · Zbl 1063.62059
[34] Wood, S. N., Stable and efficient multiple smoothing parameter estimation for generalized additive models, Journal of the American Statistical Association, 99, 673-686 (2004) · Zbl 1117.62445
[35] Wood, S. N., Generalized Additive Models: An Introduction with R (2006), Chapman & Hall · Zbl 1087.62082
[36] Wood, S. N., On confidence intervals for generalized additive models based on penalized regression splines, Australian and New Zealand Journal of Statistics, 48, 445-464 (2006) · Zbl 1110.62042
[37] Wood, S.N., 2006c. R-Manual: The mgcv package, version 1.3-22. Technical Report.
[38] Wood, S. N., Fast stable direct fitting and smoothness selection for generalized additive models, Journal of the Royal Statistical Society, 70, 495-518 (2008) · Zbl 05563356