Use of Resampling Procedures to Investigate Issues of Model Building and Its Stability

  • Living reference work entry
Principles and Practice of Clinical Trials

Abstract

This chapter deals with issues in model building and the use of resampling procedures to assess model stability. Concentrating on the nonparametric bootstrap and taking material from five papers published between 1992 and 2015, procedures for variable selection, selection of the functional form for continuous variables, and treatment-covariate interactions are discussed. The methods are illustrated by using publicly available data from three randomized trials. General issues related to the selection of regression models as well as bootstrap procedures used as a pragmatic approach to gain further knowledge from clinical data are briefly outlined.
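As a rough illustration of the kind of analysis the abstract describes, the sketch below uses the nonparametric bootstrap to estimate how often each candidate variable is selected (its bootstrap inclusion frequency). It is a minimal sketch under stated assumptions, not the chapter's procedure: the simulated data, the ordinary least squares model, the backward elimination rule with a |t| threshold of 1.96, and the function names `backward_elimination` and `bootstrap_inclusion_frequencies` are all hypothetical choices made for this example.

```python
import numpy as np

def backward_elimination(X, y, t_crit=1.96):
    """Return column indices kept by a simple backward elimination on OLS |t| values.

    Assumption for illustration only: the selection rule and threshold are
    hypothetical and stand in for whatever selection strategy is being studied.
    """
    keep = list(range(X.shape[1]))
    while keep:
        Xk = np.column_stack([np.ones(len(y)), X[:, keep]])  # intercept + candidates
        beta, *_ = np.linalg.lstsq(Xk, y, rcond=None)
        resid = y - Xk @ beta
        dof = len(y) - Xk.shape[1]
        sigma2 = resid @ resid / dof                 # residual variance estimate
        cov = sigma2 * np.linalg.inv(Xk.T @ Xk)      # covariance of the OLS estimates
        t_abs = np.abs(beta[1:] / np.sqrt(np.diag(cov)[1:]))
        worst = int(np.argmin(t_abs))                # weakest remaining candidate
        if t_abs[worst] >= t_crit:
            break                                    # all remaining variables pass
        keep.pop(worst)
    return keep

def bootstrap_inclusion_frequencies(X, y, n_boot=200, seed=0):
    """Repeat the selection on bootstrap samples and count how often each variable is kept."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)             # draw n rows with replacement
        selected = backward_elimination(X[idx], y[idx])
        counts[np.asarray(selected, dtype=int)] += 1
    return counts / n_boot

# Simulated data (hypothetical): two variables with an effect, three pure noise.
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 5))
y = 1.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

freqs = bootstrap_inclusion_frequencies(X, y)
print(np.round(freqs, 2))
```

Variables with a strong effect should be selected in nearly every bootstrap sample, whereas noise variables are selected only occasionally; variables with intermediate frequencies are the ones whose inclusion in a final model is least stable.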

References

  • Altman DG, Andersen PK (1989) Bootstrap investigation of the stability of a Cox regression model. Stat Med 8:771–783

  • Altman DG, Lausen B, Sauerbrei W, Schumacher M (1994) Dangers of using ‘optimal’ cutpoints in the evaluation of prognostic factors. J Natl Cancer Inst 86:829–835

  • Altman DG, McShane LM, Sauerbrei W, Taube SE (2012) Reporting recommendations for tumor marker prognostic studies (REMARK): explanation and elaboration. PLoS Med 9(5):e1001216

  • Ariyaratne TV, Billah B, Yap CH, Dinh D, Smith JA, Shardey GC, Reid CM (2011) An Australian risk prediction model for determining early mortality following aortic valve replacement. Eur J Cardiothorac Surg 38(6):815–821

  • Babu JG (2011) Resampling methods for model fitting and model selection. J Biopharm Stat 21:1177–1186

  • Binder H, Sauerbrei W (2009) Stability analysis of an additive spline model for respiratory health data by using knot removal. J R Stat Soc C 58:577–600

  • Bonetti M, Gelber RD (2004) Patterns of treatment effects in subsets of patients in clinical trials. Biostatistics 5:465–481

  • Boulesteix AL, Binder H, Abrahamowicz M, Sauerbrei W (2018) On the necessity and design of studies comparing statistical methods. Biom J 60(1):216–218

  • Breiman L (1992) The little bootstrap and other methods for dimensionality selection in regression: X-fixed prediction error. J Am Stat Assoc 87:738–754

  • Carpenter J, Bithell J (2000) Bootstrap confidence intervals: when, which, what? A practical guide for medical statisticians. Stat Med 19:1141–1164

  • Chen C, George SL (1985) The bootstrap and identification of prognostic factors via Cox’s proportional hazards regression model. Stat Med 4:39–46

  • Chernick MR (2008) Bootstrap methods. A guide for practitioners and researchers. Wiley, Hoboken

  • Davison AC, Hinkley DV (1997) Bootstrap methods and their application. Cambridge University Press, Cambridge, MA

  • De Bin R, Sauerbrei W (2017) Handling co-dependence issues in resampling-based variable selection procedures: a simulation study. J Stat Comput Simul 88(1):28–55

  • De Bin R, Janitza S, Sauerbrei W, Boulesteix AL (2016) Subsampling versus bootstrapping in resampling-based model selection for multivariable regression. Biometrics 72(1):272–280

  • Donegan S, Williams L, Dias S, Tudur-Smith C, Welton N (2015) Exploring treatment by covariate interactions using subgroup analysis and meta-regression in Cochrane reviews: a review of recent practice. PLoS One 10(6):e0128804

  • Efron B (1979) Bootstrap methods: another look at the jackknife. Ann Stat 7:1–26

  • Harrell FE (2001) Regression modelling strategies, with applications to linear models, logistic regression, and survival analysis. Springer, New York

  • Heinze G, Wallisch C, Dunkler D (2018) Variable selection – a review and recommendations for the practicing statistician. Biom J 60:431–449

  • Hennig C, Sauerbrei W (2019) Exploration of the variability of variable selection based on distances between bootstrap sample results. ADAC. To appear

  • Huebner M, Le Cessie S, Schmidt CO, Vach W (2018) A contemporary conceptual framework for initial data analysis. Obs Stud 4:171–192

  • Janitza S, Binder H, Boulesteix AL (2016) Pitfalls of hypothesis tests and model selection on bootstrap samples: causes and consequences in biometrical applications. Biom J 58:447–473

  • LePage R, Billard L (1992) Exploring the limits of bootstrap. Wiley, New York

  • Lusa L, McShane LM, Radmacher MD, Shih JH, Wright GW, Simon R (2007) Appropriateness of some resampling-based inference procedures for assessing performance of prognostic classifiers derived from microarray data. Stat Med 26(5):1102–1113

  • Medical Research Council Renal Cancer Collaborators (MRCRCC) (1999) Interferon-α and survival in metastatic renal carcinoma: early results of a randomised controlled trial. Lancet 353:14–17

  • Meinshausen N, Bühlmann P (2010) Stability selection. J R Stat Soc B 72:417–473

  • Moons KG, Altman DG, Reitsma JB, Ioannidis JP, Macaskill P, Steyerberg EW, Vickers AJ, Ransohoff DF, Collins GS (2015) Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med 162(1):W1–W73

  • Rospleszcz S, Janitza S, Boulesteix AL (2016) Categorical variables with many categories are preferentially selected in bootstrap-based model selection procedures for multivariable regression models. Biom J 58:652–673

  • Royston P, Altman DG (1994) Regression using fractional polynomials of continuous covariates: parsimonious parametric modelling. Appl Stat 43:429–467

  • Royston P, Sauerbrei W (2003) Stability of multivariable fractional polynomial models with selection of variables and transformations: a bootstrap investigation. Stat Med 22:639–659

  • Royston P, Sauerbrei W (2004) A new approach to modelling interactions between treatment and continuous covariates in clinical trials by using fractional polynomials. Stat Med 23:2509–2525

  • Royston P, Sauerbrei W (2008) Multivariable model-building—a pragmatic approach to regression analysis based on fractional polynomials for modelling continuous variables. Wiley, New York

  • Royston P, Sauerbrei W (2009a) Bootstrap assessment of the stability of multivariable models. Stata J 9:547–570

  • Royston P, Sauerbrei W (2009b) Two techniques for investigating interactions between treatment and continuous covariates in clinical trials. Stata J 9:230–251

  • Royston P, Sauerbrei W (2013) Interaction of treatment with a continuous variable: simulation study of significance level for several methods of analysis. Stat Med 32:3788–3803

  • Royston P, Sauerbrei W (2014) Interaction of treatment with a continuous variable: simulation study of power for several methods of analysis. Stat Med 33:4695–4708

  • Royston P, Altman DG, Sauerbrei W (2006) Dichotomizing continuous predictors in multiple regression: a bad idea. Stat Med 25:127–141

  • Sauerbrei W (1999) The use of resampling methods to simplify regression models in medical statistics. J R Stat Soc: Ser C: Appl Stat 48:313–329

  • Sauerbrei W, Royston P (1999) Building multivariable prognostic and diagnostic models: transformation of the predictors by using fractional polynomials. J R Stat Soc A Stat Soc 162:71–94

  • Sauerbrei W, Royston P (2007) Modelling to extract more information from clinical trials data: on some roles for the bootstrap. Stat Med 26:4989–5001

  • Sauerbrei W, Schumacher M (1992) A bootstrap resampling procedure for model building: application to the Cox regression model. Stat Med 11:2093–2109

  • Sauerbrei W, Royston P, Binder H (2007a) Selection of important variables and determination of functional form for continuous predictors in multivariable model-building. Stat Med 26:5512–5528

  • Sauerbrei W, Royston P, Zapien K (2007b) Detecting an interaction between treatment and a continuous covariate: a comparison of two approaches. Comput Stat Data Anal 51:4054–4063

  • Sauerbrei W, Abrahamowicz M, Altman DG, le Cessie S, Carpenter J, on behalf of the STRATOS initiative (2014) STRengthening analytical thinking for observational studies: the STRATOS initiative. Stat Med 33:5413–5432

  • Sauerbrei W, Buchholz A, Boulesteix A, Binder H (2015) On stability issues in deriving multivariable regression models. Biom J 57:531–555

  • Schumacher M, Hollaender N, Schwarzer G, Binder H, Sauerbrei W (2012) Prognostic factor studies. In: Crowley J, Hoering A (eds) Handbook of statistics in clinical oncology, 3rd edn. Chapman and Hall/CRC, Boca Raton, pp 415–470

  • Sekula P, Mallett S, Altman DG, Sauerbrei W (2017) Did the reporting of prognostic studies of tumour markers improve since the introduction of REMARK guideline? A comparison of reporting in published articles. PLoS One 12(6):e0178531

  • Shmueli G (2010) To explain or to predict? Stat Sci 25:289–310

  • Verschraegen C, Vinh-Hung V, Cserni G, Gordon R, Royce ME, Vlastos G, Tai P, Storme G (2005) Modeling the effect of tumor size in early breast cancer. Ann Surg 241:309–318

  • Wang R, Lagakos SW, Ware JH, Hunter DJ, Drazen JM (2007) Statistics in medicine—reporting of subgroup analyses in clinical trials. N Engl J Med 357(21):2189–2194

  • Westfall PH (2011) On using the bootstrap for multiple comparisons. J Biopharm Stat 21:1187–1205

Acknowledgment

Special thanks go to Harald Binder, Anika Buchholz, Patrick Royston, and Martin Schumacher, the co-authors of the papers that were used as cornerstones for this article. We also thank Georg Heinze and Christine Wallisch for comments on an earlier version, Alethea Charlton and Jenny Lee for linguistic improvements, and Tim Haeussler, Martin Haslberger, and Andreas Ott for administrative assistance. Finally, we thank the Deutsche Forschungsgemeinschaft, which supported parts of the work with grants BO3139/4-3 to ALB and SA580/8-3 to WS and with grants to projects leading to some of the earlier papers.

Author information

Corresponding author

Correspondence to Willi Sauerbrei.

Copyright information

© 2020 Springer Nature Switzerland AG

About this entry

Cite this entry

Sauerbrei, W., Boulesteix, AL. (2020). Use of Resampling Procedures to Investigate Issues of Model Building and Its Stability. In: Piantadosi, S., Meinert, C. (eds) Principles and Practice of Clinical Trials. Springer, Cham. https://doi.org/10.1007/978-3-319-52677-5_130-1

  • DOI: https://doi.org/10.1007/978-3-319-52677-5_130-1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-52677-5

  • Online ISBN: 978-3-319-52677-5

  • eBook Packages: Springer Reference Mathematics, Reference Module Computer Science and Engineering
