Abstract
This chapter deals with issues in model building and with the use of resampling procedures to assess model stability. Concentrating on the nonparametric bootstrap and drawing on five papers published between 1992 and 2015, we discuss procedures for variable selection, for selecting the functional form of continuous variables, and for investigating treatment-covariate interactions. The methods are illustrated with publicly available data from three randomized trials. We also briefly outline general issues in the selection of regression models and the use of bootstrap procedures as a pragmatic way to gain further knowledge from clinical data.
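To make the idea of using the nonparametric bootstrap to assess the stability of variable selection concrete, the following is a minimal sketch and not the procedure used in the chapter. It assumes an ordinary linear model with backward elimination on simulated data, whereas the chapter works with clinical trial data and other regression models. The function names `backward_elimination` and `bootstrap_inclusion_frequencies`, the threshold `t_crit`, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def backward_elimination(X, y, t_crit=1.96):
    """Return indices of columns kept by backward elimination in an OLS model.

    Assumption for illustration: the variable with the smallest |t| statistic
    is dropped repeatedly until all remaining variables have |t| >= t_crit.
    An intercept is always included and never tested.
    """
    n = X.shape[0]
    active = list(range(X.shape[1]))
    while active:
        design = np.column_stack([np.ones(n), X[:, active]])
        beta, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        dof = n - design.shape[1]
        sigma2 = resid @ resid / dof
        cov = sigma2 * np.linalg.inv(design.T @ design)
        # |t| statistics of the non-intercept coefficients
        t_stats = np.abs(beta[1:] / np.sqrt(np.diag(cov)[1:]))
        weakest = int(np.argmin(t_stats))
        if t_stats[weakest] >= t_crit:
            break
        active.pop(weakest)
    return active


def bootstrap_inclusion_frequencies(X, y, n_boot=200, seed=0):
    """Fraction of bootstrap replications in which each variable is selected."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        # Nonparametric bootstrap: draw n rows with replacement
        idx = rng.integers(0, n, size=n)
        selected = backward_elimination(X[idx], y[idx])
        counts[np.asarray(selected, dtype=int)] += 1
    return counts / n_boot


# Simulated data (hypothetical): two variables with an effect, three without
rng = np.random.default_rng(42)
n = 200
X = rng.normal(size=(n, 5))
y = 1.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

bif = bootstrap_inclusion_frequencies(X, y)
print(np.round(bif, 2))
```

In this sketch, variables with a real effect should be selected in nearly every replication, while the inclusion frequencies of the pure noise variables indicate how unstable their selection in a single analysis would be.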
Acknowledgment
Special thanks go to Harald Binder, Anika Buchholz, Patrick Royston, and Martin Schumacher, the co-authors of the papers that served as cornerstones for this article. We also thank Georg Heinze and Christine Wallisch for comments on an earlier version, Alethea Charlton and Jenny Lee for linguistic improvements, and Tim Haeussler, Martin Haslberger, and Andreas Ott for administrative assistance. Finally, we thank the Deutsche Forschungsgemeinschaft, which supported parts of the work with grants BO3139/4–3 to ALB and SA580/8–3 to WS and with grants to projects leading to some of the earlier papers.
Copyright information
© 2020 Springer Nature Switzerland AG
About this entry
Cite this entry
Sauerbrei, W., Boulesteix, AL. (2020). Use of Resampling Procedures to Investigate Issues of Model Building and Its Stability. In: Piantadosi, S., Meinert, C. (eds) Principles and Practice of Clinical Trials. Springer, Cham. https://doi.org/10.1007/978-3-319-52677-5_130-1
DOI: https://doi.org/10.1007/978-3-319-52677-5_130-1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-52677-5
Online ISBN: 978-3-319-52677-5
eBook Packages: Springer Reference Mathematics; Reference Module Computer Science and Engineering