Multivariable regression model building by using fractional polynomials: description of SAS, STATA and R programs. (English) Zbl 1445.62008

Summary: In fitting regression models, data analysts often face many predictor variables that may influence the outcome. Several strategies for selecting variables to identify a subset of ‘important’ predictors have been available for many years. A further issue in model building is how to deal with non-linearity in the relationship between the outcome and a continuous predictor. Traditionally, either a linear functional relationship or a step function after grouping is assumed for such predictors. However, the assumption of linearity may be incorrect, leading to a misspecified final model. For multivariable model building, a systematic approach that investigates possible non-linear functional relationships based on fractional polynomials (FPs), combined with backward elimination, was proposed recently. So far, a program has been available only in Stata, which has certainly prevented a more general application of this useful procedure. The approach is introduced, its advantages are shown in two examples, a new approach to presenting FP functions is illustrated, and a macro in SAS is briefly introduced. Differences from the Stata and R programs are noted.
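
As a rough illustration of the idea behind a first-degree fractional polynomial (FP1), the sketch below transforms a positive predictor with each candidate power and keeps the power whose straight-line fit has the smallest residual sum of squares. It is not taken from the paper or from the SAS, Stata or R programs: the function names and the toy data are hypothetical, the power set is the one conventionally used in the fractional polynomial literature, and the second-degree functions, the deviance-based tests and the backward elimination step mentioned in the summary are all omitted.

```python
import math

# Power set conventionally used in the fractional polynomial literature
# (an assumption here, not quoted from the summary); power 0 means log(x).
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_transform(x, p):
    """Return x**p, with p == 0 standing for log(x). Requires x > 0."""
    if x <= 0:
        raise ValueError("fractional polynomials need a positive predictor")
    return math.log(x) if p == 0 else x ** p


def rss_simple_regression(z, y):
    """Residual sum of squares of the least-squares line y ~ a + b*z."""
    n = len(z)
    z_mean = sum(z) / n
    y_mean = sum(y) / n
    szz = sum((zi - z_mean) ** 2 for zi in z)
    szy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
    slope = szy / szz
    intercept = y_mean - slope * z_mean
    return sum((yi - intercept - slope * zi) ** 2 for zi, yi in zip(z, y))


def best_fp1_power(x, y, powers=FP_POWERS):
    """Fit one line per candidate power and return the best power and all RSS values."""
    fits = {
        p: rss_simple_regression([fp_transform(xi, p) for xi in x], y)
        for p in powers
    }
    return min(fits, key=fits.get), fits


# Hypothetical toy data generated exactly from y = 2 + 3*log(x).
x = [1.0, 2.0, 3.0, 5.0, 8.0, 13.0]
y = [2 + 3 * math.log(xi) for xi in x]
best_power, fits = best_fp1_power(x, y)
```

On this toy data the logarithmic transformation (power 0) reproduces the outcome exactly, so it is selected; with real data the candidate fits would be compared by a formal test rather than by raw residual sums of squares.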

MSC:

62-08 Computational methods for problems pertaining to statistics
62P10 Applications of statistics to biology and medical sciences; meta analysis

Software:

SAS; Stata; R

References:

[1] Altman, D. G.; Lausen, B.; Sauerbrei, W.; Schumacher, M., The dangers of using optimal cutpoints in the evaluation of prognostic factors, J. Nat. Cancer Inst., 86, 829-835 (1994)
[2] Ambler, G.; Royston, P., Fractional polynomial model selection procedures: investigation of type I error rate, J. Statist. Simulation Comput., 69, 89-108 (2001) · Zbl 1016.62079
[3] Austin, P.; Brunner, L., Inflation of the type I error rate when a continuous confounding variable is categorized in logistic regression analyses, Statist. Med., 23, 1159-1178 (2004)
[4] Becher, H., The concept of residual confounding in regression models and some applications, Statist. Med., 11, 1747-1758 (1992)
[5] Dales, L. G.; Ury, H. K., An improper use of statistical significance testing in study covariables, Internat. J. Epidemiol., 4, 373-375 (1978)
[6] Holländer, N., Schumacher, M., 2005. Estimating the functional form of continuous covariates effect on survival time. Comput. Statist. Data Anal., in press, doi:10.1016/j.csda.2004.11.008
[7] Johnson, R. W., Fitting percentage of body fat to simple body measurements, J. Statist. Education, 4, 1 (1996), See also \(\langle\) http://www.amstat.org/publications/jse/v4n1/datasets.johnson.html \(\rangle \)
[8] Lausen, B.; Schumacher, M., Evaluating the effect of optimized cut-off values in the assessment of prognostic factors, Comput. Statist. Data Anal., 21, 307-326 (1996) · Zbl 0875.62567
[9] Mantel, N., Why stepdown procedures in variable selection, Technometrics, 12, 621-625 (1970)
[10] Marcus, R.; Peritz, E.; Gabriel, K., On closed test procedures with special reference to ordered analysis of variance, Biometrika, 76, 655-660 (1976) · Zbl 0353.62037
[11] Meier-Hirmer, C., Ortseifen, C., Sauerbrei, W., 2003. Multivariable fractional polynomials in SAS - an algorithm for determining the transformation of continuous covariates and selection of covariates. See \(\langle\) http://www.imbi.uni-freiburg.de/download/mfp/ \(\rangle\)
[12] Mickey, R. M.; Greenland, S., The impact of confounder selection criteria on effect estimation, Amer. J. Epidemiol., 129, 125-137 (1989)
[13] Miller, A. J., Subset Selection in Regression (1990), Chapman & Hall: New York · Zbl 0702.62057
[14] Miller, R.; Siegmund, D., Maximally selected chi-square statistics, Biometrics, 38, 1011-1016 (1982) · Zbl 0502.62091
[15] Penrose, K. W.; Nelson, A. G.; Fisher, A. G., Generalized body composition prediction equation for men using simple measurement techniques, Med. Sci. Sports Exercise, 17, 189 (1985)
[16] R Development Core Team, 2004. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, ISBN 3-900051-00-3, Vienna, Austria, \(\langle\) http://www.R-project.org \(\rangle\)
[17] Rosenberg, P. S.; Katki, H.; Swanson, C. A.; Brown, L. M.; Wacholder, S.; Hoover, R. N., Quantifying epidemiologic risk factors using non-parametric regression: model selection remains the greatest challenge, Statist. Med., 22, 3369-3381 (2003)
[18] Royston, P.; Altman, D. G., Regression using fractional polynomials of continuous covariates: parsimonious parametric modelling (with discussion), Appl. Statist., 43, 3, 429-467 (1994)
[19] Royston, P.; Ambler, G., Multivariable fractional polynomials: update, Stata Technical Bull., 49, 17-23 (1999)
[20] Royston, P.; Ambler, G.; Sauerbrei, W., The use of fractional polynomials to model continuous risk variables in epidemiology, Internat. J. Epidemiol., 28, 964-974 (1999)
[21] Royston, P.; Sauerbrei, W., Stability of multivariable fractional polynomial models with selection of variables and transformations: a bootstrap investigation, Statist. Med., 22, 639-659 (2003)
[22] Royston, P., Sauerbrei, W., 2004a. Improving the robustness of fractional polynomial models by preliminary covariate transformation. Submitted for publication. · Zbl 1162.62387
[23] Royston, P.; Sauerbrei, W., A new approach to modelling interactions between treatment and continuous covariates in clinical trials by using fractional polynomials, Statist. Med., 23, 2509-2525 (2004)
[24] Royston, P., Sauerbrei, W., 2005. Building multivariable regression models with continuous covariates in clinical epidemiology, with an emphasis on fractional polynomials. Methods Inform. Med., to appear.
[25] Sauerbrei, W., 1993. Comparison of variable selection procedures in regression models a simulation study and practical examples. In: Michaelis, J., et al. (Eds.), Europaeische Perspektiven der Medizinischen Informatik, Biometrie und Epidemiologie. MMV Muenchen, pp. 108-113.
[26] Sauerbrei, W., The use of resampling methods to simplify regression models in medical statistics, Appl. Statist., 48, 313-329 (1999) · Zbl 0939.62114
[27] Sauerbrei, W.; Royston, P., Building multivariable prognostic and diagnostic models: transformation of the predictors by using fractional polynomials, J. Roy. Statist. Soc. (Ser. A), 162, 71-94 (1999)
[28] Sauerbrei, W.; Royston, P., Corrigendum: building multivariable prognostic and diagnostic models: transformation of the predictors by using fractional polynomials, J. Roy. Statist. Soc. (Ser. A), 162, 399-400 (2002)
[29] Sauerbrei, W., Royston, P., Bojar, H., Schmoor, C., Schumacher, M., the German Breast Cancer Study Group, 1999. Modelling the effects of standard prognostic factors in node positive breast cancer. Br. J. Cancer 79, 1752-1760.
[30] Stata Corp, 2003. Stata Reference Manual, Version 8. Stata Press, College Station, TX.